├── README.md ├── p0_taxi-v2 ├── README.md ├── agent.py ├── images │ ├── all_perf.png │ ├── expected_sarsa_algo.png │ ├── expected_sarsa_perf.png │ ├── expected_sarsa_update_rule.png │ ├── sarsa_algo.png │ ├── sarsa_perf.png │ ├── sarsa_update_rule.png │ ├── sarsamax_algo.png │ ├── sarsamax_perf.png │ ├── sarsamax_update_rule.png │ ├── taxi-game.gif │ └── taxi_game_gif.gif ├── main.py └── monitor.py ├── p1_navigation ├── Navigation.ipynb ├── README.md ├── config.json ├── dqn_agent.py ├── main.py ├── model.py ├── report.pdf ├── requirements.txt ├── saved │ └── DQN_exp │ │ └── model_trained_solved.pth └── utils.py ├── p2_continuous_control ├── Continuous_Control.ipynb ├── README.md ├── config.json ├── ddpg_agent.py ├── images │ └── reacher_gif.gif ├── models.py ├── report.pdf ├── requirements.txt └── saved │ └── DDPG_exp │ ├── checkpoint_actor_solved.pth │ └── checkpoint_critic_solved.pth └── p3_collab_compet ├── DDPGAgents.py ├── OUNoise.py ├── README.md ├── ReplayBuffer.py ├── Tennis.ipynb ├── config.json ├── images └── tennis_gif.gif ├── report.pdf ├── requirements.txt ├── saved └── DDPGAgents_exp │ ├── checkpoint_actor_solved.pth │ └── checkpoint_critic_solved.pth └── utils.py /README.md: -------------------------------------------------------------------------------- 1 | # Udacity Deep Reinforcement Learning Nanodegree 2 | 3 | This repository contains project files for Udacity's Deep Reinforcement Learning Nanodegree program. 4 | 5 | ## Projects 6 | 7 | ### Reinforcement Learning 8 | >[P0_Taxi](https://github.com/vmelan/DRLND-udacity/tree/master/p0_taxi-v2) 9 | 10 | For this project, we use OpenAI Gym's Taxi-v2 environment to design an algorithm that teaches a taxi agent to navigate a small gridworld using Reinforcement Learning methods. 11 | 12 | ### Navigation 13 | >[P1_Navigation](https://github.com/vmelan/DRLND-udacity/tree/master/p1_navigation) 14 | 15 | For this project, we train an agent to navigate and collect bananas in a large, square world using a Deep Q-Network (DQN). 16 | 17 | ### Continuous Control 18 | >[P2_Continuous Control](https://github.com/vmelan/DRLND-udacity/tree/master/p2_continuous_control) 19 | 20 | For this project, we train a double-jointed arm agent to follow a target location using Deep Deterministic Policy 21 | Gradient (DDPG). 22 | 23 | ### Collaboration and Competition 24 | >[P3_Collaboration Competition](https://github.com/vmelan/DRLND-udacity/tree/master/p3_collab_compet) 25 | 26 | For this project, we train a pair of agents to play tennis using DDPG with a shared replay buffer. 27 | 28 | -------------------------------------------------------------------------------- /p0_taxi-v2/README.md: -------------------------------------------------------------------------------- 1 | # Project: OpenAI Gym's Taxi-v2 Task 2 | 3 | For this project, we use OpenAI Gym's Taxi-v2 environment to design an 4 | algorithm to teach a taxi agent to navigate a small gridworld. 5 | 6 |

7 | 8 |

9 | 10 | ## Problem Statement 11 | This problem comes from the paper Hierarchical Reinforcement 12 | Learning with the MAXQ Value Function Decomposition by Tom Dietterich (link 13 | to the paper: https://arxiv.org/pdf/cs/9905014.pdf), section 3.1 A Motivation Example. 14 | 15 | There are four specially-designated locations in 16 | this world, marked as R(ed), B(lue), G(reen), and Y(ellow). The taxi problem is episodic. In 17 | each episode, the taxi starts in a randomly-chosen square. There is a passenger at one of the 18 | four locations (chosen randomly), and that passenger wishes to be transported to one of the four 19 | locations (also chosen randomly). The taxi must go to the passenger’s location (the “source”), pick 20 | up the passenger, go to the destination location (the “destination”), and put down the passenger 21 | there. (To keep things uniform, the taxi must pick up and drop off the passenger even if he/she 22 | is already located at the destination!) The episode ends when the passenger is deposited at the 23 | destination location. 24 | 25 | There are six primitive actions in this domain: (a) four navigation actions that move the taxi 26 | one square North, South, East, or West, (b) a Pickup action, and (c) a Putdown action. Each action 27 | is deterministic. There is a reward of −1 for each action and an additional reward of +20 for 28 | successfully delivering the passenger. There is a reward of −10 if the taxi attempts to execute the 29 | Putdown or Pickup actions illegally. If a navigation action would cause the taxi to hit a wall, the 30 | action is a no-op, and there is only the usual reward of −1. 31 | 32 | We seek a policy that maximizes the total reward per episode. There are 500 possible states: 33 | 25 squares, 5 locations for the passenger (counting the four starting locations and the taxi), and 4 34 | destinations. 35 | 36 | ## Files 37 | - `agent.py`: Agent class in which we will develop our reinforcement learning methods 38 | - `monitor.py`: The `interact` function tests how well the agent learns from interaction with the environment 39 | - `main.py`: Main file to run the project for checking the performance of the agent 40 | 41 | ## Temporal-Difference (TD) Control Methods 42 | While **Monte-Carlo** approaches require us to run the agent through an entire episode before updating our estimates, 43 | this is no longer viable for **continuing** tasks that do not have a terminal state, as well as for **episodic** tasks when we 44 | do not want to wait until the end of the episode before updating our estimates. 45 | 46 | This is where Temporal-Difference (TD) control methods step in: they update estimates based in part 47 | on other learned estimates, without waiting for the final outcome. As such, TD methods update the 48 | **Q-table** after every time step. 49 | 50 | ### Sarsa 51 | The Sarsa update rule is the following: 52 | 53 |

54 | 55 |

56 | 57 | Notice that the action-value update uses the **S**tate, **A**ction, **R**eward, next **S**tate, and next **A**ction, hence the name 58 | of the algorithm **Sarsa(0)**, or simply **Sarsa**. 59 | 60 |
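In code, this update amounts to only a couple of lines. The sketch below is an illustration (it is not the code in `agent.py`, which implements Sarsamax) and assumes `Q` maps each state to a NumPy array of action values:

```python
# Tabular Sarsa update: the next action A' is sampled from the current
# (epsilon-greedy) policy, which makes the method on-policy.
def sarsa_update(Q, state, action, reward, next_state, next_action, alpha, gamma, done):
    target = reward if done else reward + gamma * Q[next_state][next_action]
    Q[state][action] += alpha * (target - Q[state][action])
```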

61 | 62 |

63 | 64 | Here is the performance of Sarsa on the Taxi task: 65 |

66 | 67 |

68 | The average reward over the last 100 episodes keeps improving until roughly the 2000th episode, where it finally 69 | reaches convergence and stops improving. 70 | 71 | ### Expected Sarsa 72 | The Expected Sarsa update rule is the following: 73 | 74 |

75 | 76 |

77 | Expected Sarsa uses the expected value of the next state-action pair, where the expectation takes into account the probability that the agent selects each possible action from the next state. 78 | 79 |
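A minimal sketch of this update in Python, assuming the same epsilon-greedy policy used by the agent (illustrative, not the repository's exact code):

```python
import numpy as np

# Tabular Expected Sarsa update: average the next-state action values
# under the epsilon-greedy policy instead of sampling a single next action.
def expected_sarsa_update(Q, state, action, reward, next_state, alpha, gamma, eps, nA, done):
    probs = np.ones(nA) * (eps / nA)             # exploration probability for every action
    probs[np.argmax(Q[next_state])] += 1 - eps   # extra mass on the greedy action
    expected_q = 0.0 if done else np.dot(probs, Q[next_state])
    Q[state][action] += alpha * (reward + gamma * expected_q - Q[state][action])
```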

80 | 81 |

82 | 83 | Here is the performance of Expected Sarsa on the Taxi task: 84 |

85 | 86 |

87 | The resulting graph is noisier than Sarsa's, because we are averaging over all the possible 88 | actions in the next state. Convergence takes more time, and there is still some gradual, albeit small, improvement. 89 | 90 | ### Sarsamax (or Q-Learning) 91 | The Sarsamax (or Q-Learning) update rule is the following: 92 | 93 |

94 | 95 |

96 | In Sarsamax, the update rule attempts to approximate the optimal action-value function 97 | at every time step. 98 | 99 |
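This is the update implemented in `agent.py` (see its `step` method); as a standalone sketch:

```python
import numpy as np

# Tabular Sarsamax (Q-learning) update: bootstrap from the greedy action
# in the next state, regardless of the action the policy actually takes.
def q_learning_update(Q, state, action, reward, next_state, alpha, gamma, done):
    max_next = 0.0 if done else np.max(Q[next_state])
    Q[state][action] += alpha * (reward + gamma * max_next - Q[state][action])
```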

100 | 101 |

102 | Here is the performance of Sarsamax on the Taxi task: 103 |

104 | 105 |

106 | Sarsamax is smoother and follows the same trend as Sarsa. 107 | 108 | ### Overview 109 | Sarsa, Expected Sarsa and Sarsamax have each been trained for 20000 episodes, and we can visualize 110 | their performance in the same graph: 111 |

112 | 113 |

-------------------------------------------------------------------------------- /p0_taxi-v2/agent.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from collections import defaultdict 3 | 4 | class Agent: 5 | 6 | def __init__(self, nA=6): 7 | """ Initialize agent. 8 | 9 | Params 10 | ====== 11 | - nA: number of actions available to the agent 12 | """ 13 | self.nA = nA 14 | self.Q = defaultdict(lambda: np.zeros(self.nA)) 15 | 16 | self.eps = 1.0 17 | self.eps_decay = 0.99 18 | self.eps_min = 0.005 19 | 20 | self.alpha = 0.1 21 | self.gamma = 0.9 22 | 23 | def get_policy(self, Q_s): 24 | """ Obtain the action probabilities corresponding to epsilon-greedy policies """ 25 | self.eps = max(self.eps*self.eps_decay, self.eps_min) 26 | policy_s = np.ones(self.nA) * (self.eps / self.nA) 27 | best_a = np.argmax(Q_s) 28 | policy_s[best_a] = 1 - self.eps + (self.eps / self.nA) 29 | 30 | return policy_s 31 | 32 | def select_action(self, state): 33 | """ Given the state, select an action. 34 | 35 | Params 36 | ====== 37 | - state: the current state of the environment 38 | 39 | Returns 40 | ======= 41 | - action: an integer, compatible with the task's action space 42 | """ 43 | policy_s = self.get_policy(self.Q[state]) 44 | action = np.random.choice(np.arange(self.nA), p=policy_s) 45 | 46 | return action 47 | 48 | def step(self, state, action, reward, next_state, done): 49 | """ Update the agent's knowledge, using the most recently sampled tuple. 50 | 51 | Params 52 | ====== 53 | - state: the previous state of the environment 54 | - action: the agent's previous choice of action 55 | - reward: last reward received 56 | - next_state: the current state of the environment 57 | - done: whether the episode is complete (True or False) 58 | 59 | """ 60 | ## Using update rule of Sarsamax (Q-Learning) 61 | 62 | if not done: 63 | self.Q[state][action] = self.Q[state][action] + self.alpha * (reward + (self.gamma * np.max(self.Q[next_state])) - self.Q[state][action]) 64 | if done: 65 | self.Q[state][action] = self.Q[state][action] + self.alpha * (reward + self.gamma * 0 - self.Q[state][action]) 66 | 67 | -------------------------------------------------------------------------------- /p0_taxi-v2/images/all_perf.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/all_perf.png -------------------------------------------------------------------------------- /p0_taxi-v2/images/expected_sarsa_algo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/expected_sarsa_algo.png -------------------------------------------------------------------------------- /p0_taxi-v2/images/expected_sarsa_perf.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/expected_sarsa_perf.png -------------------------------------------------------------------------------- /p0_taxi-v2/images/expected_sarsa_update_rule.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/expected_sarsa_update_rule.png 
-------------------------------------------------------------------------------- /p0_taxi-v2/images/sarsa_algo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/sarsa_algo.png -------------------------------------------------------------------------------- /p0_taxi-v2/images/sarsa_perf.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/sarsa_perf.png -------------------------------------------------------------------------------- /p0_taxi-v2/images/sarsa_update_rule.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/sarsa_update_rule.png -------------------------------------------------------------------------------- /p0_taxi-v2/images/sarsamax_algo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/sarsamax_algo.png -------------------------------------------------------------------------------- /p0_taxi-v2/images/sarsamax_perf.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/sarsamax_perf.png -------------------------------------------------------------------------------- /p0_taxi-v2/images/sarsamax_update_rule.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/sarsamax_update_rule.png -------------------------------------------------------------------------------- /p0_taxi-v2/images/taxi-game.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/taxi-game.gif -------------------------------------------------------------------------------- /p0_taxi-v2/images/taxi_game_gif.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p0_taxi-v2/images/taxi_game_gif.gif -------------------------------------------------------------------------------- /p0_taxi-v2/main.py: -------------------------------------------------------------------------------- 1 | from agent import Agent 2 | from monitor import interact 3 | import gym 4 | import numpy as np 5 | import matplotlib.pyplot as plt 6 | 7 | import sys 8 | from collections import defaultdict 9 | import time 10 | 11 | def plot_performance(num_episodes, avg_rewards, label, disp_plot=True): 12 | plt.plot(np.linspace(0, num_episodes, len(avg_rewards),endpoint=False), np.asarray(avg_rewards), label=label) 13 | plt.xlabel('Episode Number') 14 | plt.ylabel('Average Reward (Over Next %d Episodes)' % (100)) 15 | plt.title(label + " " + "performance") 16 | if disp_plot: plt.show() 17 | 18 | def plot_all_performances(num_episodes, all_avg_rewards, title): 19 | for (avg_reward, method) in zip(all_avg_rewards, 
['Sarsa', 'Expected Sarsa', 'Sarsamax (Q-Learning)']): 20 | plot_performance(num_episodes, avg_reward, method, disp_plot=False) 21 | plt.title(title) 22 | plt.legend(loc='best') 23 | plt.show() 24 | 25 | def main(): 26 | env = gym.make('Taxi-v2') 27 | num_episodes = 20000 28 | 29 | ## Sarsa 30 | agent = Agent(method='Sarsa') 31 | sarsa_avg_rewards, sarsa_best_avg_reward = interact(env, agent, num_episodes=num_episodes) 32 | plot_performance(num_episodes, sarsa_avg_rewards, "Sarsa", disp_plot=True) 33 | 34 | # ## Expected Sarsa 35 | agent = Agent(method='Expected Sarsa') 36 | exp_sarsa_avg_rewards, exp_sarsa_best_avg_reward = interact(env, agent, num_episodes=num_episodes) 37 | plot_performance(num_episodes, exp_sarsa_avg_rewards, "Expected Sarsa", disp_plot=True) 38 | 39 | ## Q-Learning 40 | agent = Agent(method='Q-Learning') 41 | sarsamax_avg_rewards, sarsamax_best_avg_reward = interact(env, agent, num_episodes=num_episodes) 42 | plot_performance(num_episodes, sarsamax_avg_rewards, "Sarsamax (Q-Learning)", disp_plot=True) 43 | 44 | ## All performances 45 | plot_all_performances(num_episodes, [sarsa_avg_rewards, exp_sarsa_avg_rewards, sarsamax_avg_rewards], 46 | title="Comparison of Temporal Difference control methods") 47 | 48 | if __name__ == '__main__': 49 | main() -------------------------------------------------------------------------------- /p0_taxi-v2/monitor.py: -------------------------------------------------------------------------------- 1 | from collections import deque 2 | import sys 3 | import math 4 | import numpy as np 5 | 6 | def interact(env, agent, num_episodes=20000, window=100): 7 | # def interact(env, agent, num_episodes=1000, window=100): 8 | 9 | """ Monitor agent's performance. 10 | 11 | Params 12 | ====== 13 | - env: instance of OpenAI Gym's Taxi-v1 environment 14 | - agent: instance of class Agent (see Agent.py for details) 15 | - num_episodes: number of episodes of agent-environment interaction 16 | - window: number of episodes to consider when calculating average rewards 17 | 18 | Returns 19 | ======= 20 | - avg_rewards: deque containing average rewards 21 | - best_avg_reward: largest value in the avg_rewards deque 22 | """ 23 | # initialize average rewards 24 | avg_rewards = deque(maxlen=num_episodes) 25 | # initialize best average reward 26 | best_avg_reward = -math.inf 27 | # initialize monitor for most recent rewards 28 | samp_rewards = deque(maxlen=window) 29 | # for each episode 30 | for i_episode in range(1, num_episodes+1): 31 | # begin the episode 32 | state = env.reset() 33 | # initialize the sampled reward 34 | samp_reward = 0 35 | while True: 36 | # agent selects an action 37 | action = agent.select_action(state) 38 | # agent performs the selected action 39 | next_state, reward, done, _ = env.step(action) 40 | # agent performs internal updates based on sampled experience 41 | agent.step(state, action, reward, next_state, done) 42 | # update the sampled reward 43 | samp_reward += reward 44 | # update the state (s <- s') to next time step 45 | state = next_state 46 | if done: 47 | # save final sampled reward 48 | samp_rewards.append(samp_reward) 49 | break 50 | if (i_episode >= 100): 51 | # get average reward from last 100 episodes 52 | avg_reward = np.mean(samp_rewards) 53 | # append to deque 54 | avg_rewards.append(avg_reward) 55 | # update best average reward 56 | if avg_reward > best_avg_reward: 57 | best_avg_reward = avg_reward 58 | # monitor progress 59 | print("\rEpisode {}/{} || Best average reward {}".format(i_episode, num_episodes, 
best_avg_reward), end="") 60 | sys.stdout.flush() 61 | # check if task is solved (according to OpenAI Gym) 62 | if best_avg_reward >= 9.7: 63 | print('\nEnvironment solved in {} episodes.'.format(i_episode), end="") 64 | break 65 | if i_episode == num_episodes: print('\n') 66 | 67 | return avg_rewards, best_avg_reward -------------------------------------------------------------------------------- /p1_navigation/README.md: -------------------------------------------------------------------------------- 1 | # Project : Navigation 2 | 3 | ## Description 4 | For this project, we train an agent to navigate and collect bananas in a large, 5 | square world. 6 | 7 | ## Problem statement 8 | A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided 9 | for collecting a blue banana. Thus, the goal of the agent is to collect 10 | as many yellow bananas as possible while avoiding blue bananas. 11 | 12 | The state space has 37 dimensions, and contains the agent's velocity, along 13 | with ray-based perception of objects around the agent's forward 14 | direction. Given this information, the agent has to learn how to best select 15 | actions. 16 | Four discrete actions are available, corresponding to: 17 | - `0` - move forward 18 | - `1` - move backward 19 | - `2` - turn left 20 | - `3` - turn right 21 | The task is episodic, and in order to solve the environment, the 22 | agent must get an average score of +13 over 100 consecutive episodes. 23 | 24 | ## Files 25 | - `Navigation.ipynb`: Notebook used to control and train the agent 26 | - `main.py`: Main script used to control and train the agent for experimentation 27 | - `dqn_agent.py`: Create an Agent class that interacts with and learns from the environment 28 | - `model.py`: Q-network class used to map state to action values 29 | - `config.json`: Configuration file to store variables and paths 30 | - `utils.py`: Helper functions 31 | - `report.pdf`: Technical report 32 | 33 | ## Dependencies 34 | To be able to run this code, you will need an environment with Python 3. The 35 | dependencies are listed in the `requirements.txt` file, and you can install them 36 | using the following command: 37 | ``` 38 | pip install -r requirements.txt 39 | ``` 40 | 41 | Furthermore, you need to download the environment from one of the links below. You only need to select 42 | the environment that matches your operating system: 43 | - Linux : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Linux.zip) 44 | - MAC OSX : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana.app.zip) 45 | - Windows : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Windows_x86_64.zip) 46 | 47 | ## Running 48 | Run the cells in the notebook `Navigation.ipynb` to train an agent that solves our required 49 | task of collecting bananas. 
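A trained agent can also be reloaded outside the notebook for a quick evaluation. The snippet below is an illustrative sketch; it assumes the checkpoint written by `main.py` and the settings in `config.json`:

```python
# Sketch: reload the solved DQN weights and act greedily (epsilon = 0).
import json
import torch
from dqn_agent import Agent

with open("config.json", "r") as f:
    config = json.load(f)

agent = Agent(state_size=37, action_size=4, config=config)
state_dict = torch.load("saved/DQN_exp/model_trained_solved.pth", map_location="cpu")
agent.qnetwork_local.load_state_dict(state_dict)

# Inside an episode loop, `state` is the 37-dimensional observation from the environment:
# action = agent.act(state, epsilon=0.0)
```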
-------------------------------------------------------------------------------- /p1_navigation/config.json: -------------------------------------------------------------------------------- 1 | { 2 | "exp_name": "DQN_exp", 3 | "cuda": false, 4 | "gpu": 0, 5 | 6 | "optimizer": { 7 | "optimizer_type": "Adam", 8 | "betas": [0.9, 0.999], 9 | "optimizer_params": { 10 | "lr": 0.0005, 11 | "eps": 1e-7, 12 | "weight_decay": 0 13 | } 14 | }, 15 | 16 | "GLIE": { 17 | "eps_start": 1.0, 18 | "eps_end": 0.005, 19 | "eps_decay": 0.999 20 | }, 21 | 22 | "DQN": { 23 | "gamma": 0.99, 24 | "tau": 1e-3, 25 | "update_every": 4, 26 | "buffer_size": 5e4 27 | }, 28 | 29 | "architecture": { 30 | "hidden_layers_units": [500, 200, 100], 31 | "use_dropout": false, 32 | "dropout_proba": 0.5 33 | }, 34 | 35 | "trainer": { 36 | "num_episodes": 2000, 37 | "batch_size": 32, 38 | "max_timesteps_per_ep": 1000, 39 | "save_dir": "./saved/", 40 | "save_trained_name": "model_trained", 41 | "save_freq": 500, 42 | "verbose": 1 43 | } 44 | 45 | } -------------------------------------------------------------------------------- /p1_navigation/dqn_agent.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import random 3 | from collections import namedtuple, deque 4 | import logging 5 | 6 | from model import QNetwork 7 | 8 | import torch 9 | import torch.nn.functional as F 10 | import torch.optim as optim 11 | from utils import pick_device 12 | 13 | import pdb 14 | 15 | class Agent(): 16 | """ Agent used to interact with and learns from the environment """ 17 | 18 | def __init__(self, state_size, action_size, config): 19 | """ Initialize an Agent object """ 20 | 21 | self.state_size = state_size 22 | self.action_size = action_size 23 | self.config = config 24 | 25 | # logging for this class 26 | self.logger = logging.getLogger(self.__class__.__name__) 27 | 28 | # gpu support 29 | self.device = pick_device(config, self.logger) 30 | 31 | ## Q-Networks 32 | self.qnetwork_local = QNetwork(state_size, action_size, config).to(self.device) 33 | self.qnetwork_target = QNetwork(state_size, action_size, config).to(self.device) 34 | 35 | ## Get optimizer for local network 36 | self.optimizer = getattr(optim, config["optimizer"]["optimizer_type"])( 37 | self.qnetwork_local.parameters(), 38 | betas=tuple(config["optimizer"]["betas"]), 39 | **config["optimizer"]["optimizer_params"]) 40 | 41 | ## Replay memory 42 | self.memory = ReplayBuffer( 43 | config=config, 44 | action_size=action_size, 45 | buffer_size=int(config["DQN"]["buffer_size"]), 46 | batch_size=config["trainer"]["batch_size"] 47 | ) 48 | 49 | ## Initialize time step (for update every `update_every` steps) 50 | self.t_step = 0 51 | 52 | 53 | def step(self, state, action, reward, next_state, done): 54 | 55 | # Save experience in replay memory 56 | self.memory.add(state, action, reward, next_state, done) 57 | 58 | # Learn every `update_every` time steps 59 | self.t_step = (self.t_step + 1) % self.config["DQN"]["update_every"] 60 | if (self.t_step == 0): 61 | # If enough samples are available in memory, get random subset and learn 62 | if len(self.memory) > self.config["trainer"]["batch_size"]: 63 | experiences = self.memory.sample() 64 | self.learn(experiences, self.config["DQN"]["gamma"]) 65 | 66 | 67 | 68 | def act(self, state, epsilon): 69 | """ Returns actions for given state as per current policy """ 70 | # pdb.set_trace() 71 | 72 | # Convert state to tensor 73 | state = 
torch.from_numpy(state).float().unsqueeze(0).to(self.device) 74 | 75 | ## Evaluation mode 76 | self.qnetwork_local.eval() 77 | with torch.no_grad(): 78 | # Forward pass of local qnetwork 79 | action_values = self.qnetwork_local.forward(state) 80 | 81 | ## Training mode 82 | self.qnetwork_local.train() 83 | # Epsilon-greedy action selection 84 | if random.random() > epsilon: 85 | # Choose the best action (exploitation) 86 | return np.argmax(action_values.cpu().data.numpy()) 87 | else: 88 | # Choose random action (exploration) 89 | return random.choice(np.arange(self.action_size)) 90 | 91 | 92 | def learn(self, experiences, gamma): 93 | """ Update value parameters using given batch of experience tuples """ 94 | 95 | states, actions, rewards, next_states, dones = experiences 96 | 97 | ## TD target 98 | # Get max predicted Q-values (for next states) from target model 99 | # Q_targets_next = torch.argmax(self.qnetwork_target(next_states).detach(), dim=1).unsqueeze(1) 100 | Q_targets_next = self.qnetwork_target(next_states).detach().max(1)[0].unsqueeze(1) 101 | Q_targets_next = Q_targets_next.type(torch.FloatTensor) 102 | 103 | # Compute Q-targets for current states 104 | Q_targets = rewards + (gamma * Q_targets_next * (1 - dones)) 105 | 106 | ## old value 107 | # Get expected Q-values from local model 108 | Q_expected = torch.gather(self.qnetwork_local(states), dim=1, index=actions) 109 | 110 | # Compute loss 111 | loss = F.mse_loss(Q_expected, Q_targets) 112 | # Minimize loss 113 | self.optimizer.zero_grad() 114 | loss.backward() 115 | self.optimizer.step() 116 | 117 | # update target network with a soft update 118 | self.soft_update(self.qnetwork_local, self.qnetwork_target, self.config["DQN"]["tau"]) 119 | 120 | 121 | 122 | def soft_update(self, local_model, target_model, tau): 123 | """ 124 | Soft update model parameters 125 | θ_target = τ*θ_local + (1 - τ)*θ_target 126 | 127 | Parameters 128 | ---------- 129 | local_model (PyTorch model): weights will be copied from 130 | target_model (PyTorch model): weights will be copied to 131 | tau (float): interpolation parameter 132 | """ 133 | 134 | for target_param, local_param in zip(target_model.parameters(), local_model.parameters()): 135 | target_param.data.copy_(tau*local_param.data + (1.0 - tau)*target_param.data) 136 | 137 | 138 | class ReplayBuffer(): 139 | """ Fixed-size buffer to store experience tuples """ 140 | 141 | def __init__(self, config, action_size, buffer_size, batch_size): 142 | """ Initialize a ReplayBuffer object """ 143 | 144 | self.config = config 145 | self.action_size = action_size 146 | self.memory = deque(maxlen=buffer_size) 147 | self.batch_size = batch_size 148 | self.experience = namedtuple("Experience", 149 | field_names=["state", "action", "reward", "next_state", "done"]) 150 | 151 | # logging for this class 152 | self.logger = logging.getLogger(self.__class__.__name__) 153 | 154 | # gpu support 155 | self.device = pick_device(config, self.logger) 156 | 157 | 158 | def add(self, state, action, reward, next_state, done): 159 | """ Add a new experience to memory """ 160 | e = self.experience(state, action, reward, next_state, done) 161 | self.memory.append(e) 162 | 163 | def sample(self): 164 | """ Randomly sample a batch of experiences from memory """ 165 | experiences = random.sample(self.memory, k=self.batch_size) 166 | 167 | states = torch.from_numpy( 168 | np.vstack([e.state for e in experiences if e is not None]) 169 | ).float().to(self.device) 170 | actions = torch.from_numpy( 171 | np.vstack([e.action for e 
in experiences if e is not None]) 172 | ).long().to(self.device) 173 | rewards = torch.from_numpy( 174 | np.vstack([e.reward for e in experiences if e is not None]) 175 | ).float().to(self.device) 176 | next_states = torch.from_numpy( 177 | np.vstack([e.next_state for e in experiences if e is not None]) 178 | ).float().to(self.device) 179 | dones = torch.from_numpy( 180 | np.vstack([e.done for e in experiences if e is not None]).astype(np.uint8) 181 | ).float().to(self.device) 182 | 183 | return (states, actions, rewards, next_states, dones) 184 | 185 | 186 | def __len__(self): 187 | """ Return the current size of internal memory """ 188 | return len(self.memory) 189 | -------------------------------------------------------------------------------- /p1_navigation/main.py: -------------------------------------------------------------------------------- 1 | import json 2 | import logging 3 | import torch 4 | import numpy as np 5 | from collections import deque 6 | from dqn_agent import Agent 7 | from utils import ensure_dir 8 | import matplotlib.pyplot as plt 9 | from unityagents import UnityEnvironment 10 | 11 | plt.ion() 12 | 13 | 14 | def dqn(agent, 15 | brain_name, 16 | config, 17 | n_episodes, 18 | max_timesteps_per_ep, 19 | eps_start, 20 | eps_end, 21 | eps_decay 22 | ): 23 | 24 | """ 25 | Deep Q-Learning 26 | """ 27 | logger = logging.getLogger('dqn') # logger 28 | flag = False # When environment is technically solved 29 | # Save path 30 | save_path = config["trainer"]["save_dir"] + config["exp_name"] + "/" 31 | ensure_dir(save_path) 32 | scores = [] # list containing scores from each episodes 33 | scores_window = deque(maxlen=100) 34 | epsilon = eps_start # init epsilon 35 | 36 | for i_episode in range(1, n_episodes + 1): 37 | # reset the environment 38 | env_info = env.reset(train_mode=True)[brain_name] 39 | # get the current state 40 | state = env_info.vector_observations[0] 41 | score = 0 42 | for t in range(max_timesteps_per_ep): 43 | # choose action based on epsilon-greedy policy 44 | action = agent.act(state, epsilon) 45 | # send the action to the environment 46 | env_info = env.step(action)[brain_name] 47 | # get the next state 48 | next_state = env_info.vector_observations[0] 49 | # get the reward 50 | reward = env_info.rewards[0] 51 | # see if episode has finished 52 | done = env_info.local_done[0] 53 | # step 54 | agent.step(state, action, reward, next_state, done) 55 | # cumulative rewards into score variable 56 | score += reward 57 | # get next_state and set it to state 58 | state = next_state 59 | 60 | if done: 61 | break 62 | 63 | # Update epsilon 64 | epsilon = max(eps_decay*epsilon, eps_end) 65 | 66 | # save most recent score 67 | scores.append(score) 68 | scores_window.append(score) 69 | 70 | logger.info('\rEpisode {}\tAverage Score: {:.3f}'.format(i_episode, np.mean(scores_window))) 71 | 72 | if (i_episode % 100 == 0): 73 | logger.info("\rEpisode {}\tAverage Score: {:.3f}".format(i_episode, \ 74 | np.mean(scores_window))) 75 | 76 | # Save occasionnally 77 | if (i_episode % config["trainer"]["save_freq"] == 0): 78 | 79 | torch.save(agent.qnetwork_local.state_dict(), save_path + 80 | config["trainer"]["save_trained_name"] + "_" + str(i_episode) + ".pth") 81 | 82 | # Check if environment solved (if not already) 83 | if not flag: 84 | if (np.mean(scores_window) >= 13.0): 85 | logger.info('\nEnvironment solved in {:d} episodes!\tAverage Score: {:.3f}'.format( 86 | i_episode-100, np.mean(scores_window))) 87 | # Save solved model 88 | 
torch.save(agent.qnetwork_local.state_dict(), save_path + 89 | config["trainer"]["save_trained_name"] + "_solved.pth") 90 | flag = True 91 | 92 | return scores 93 | 94 | if __name__ == '__main__': 95 | # Configure logging for all loggers 96 | logging.basicConfig(level=logging.INFO, format='') 97 | 98 | # Load config file 99 | with open("config.json", "r") as f: 100 | config = json.load(f) 101 | 102 | # Start the environment 103 | env = UnityEnvironment(file_name="./Banana_Windows_x86_64/Banana.exe") 104 | 105 | # get the default brain 106 | brain_name = env.brain_names[0] 107 | brain = env.brains[brain_name] 108 | 109 | # Create agent 110 | agent = Agent(state_size=37, action_size=4, config=config) 111 | 112 | # Train the agent 113 | scores = dqn(agent=agent, 114 | brain_name=brain_name, 115 | config=config, 116 | n_episodes=config["trainer"]["num_episodes"], 117 | max_timesteps_per_ep=config["trainer"]["max_timesteps_per_ep"], 118 | eps_start=config["GLIE"]["eps_start"], 119 | eps_end=config["GLIE"]["eps_end"], 120 | eps_decay=config["GLIE"]["eps_decay"] 121 | ) 122 | 123 | # Close the environment 124 | env.close() -------------------------------------------------------------------------------- /p1_navigation/model.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | 6 | class QNetwork(nn.Module): 7 | """ Policy model that maps state to actions """ 8 | 9 | def __init__(self, state_size, action_size, config): 10 | """ Initialize parameters and build model """ 11 | super().__init__() 12 | 13 | 14 | # self.fc1 = nn.Linear(state_size, fc1_units) 15 | # self.fc2 = nn.Linear(fc1_units, fc2_units) 16 | # self.fc3 = nn.Linear(fc2_units, action_size) 17 | 18 | self.config = config 19 | # Retrieve variable from config file 20 | hidden_layers_units = config["architecture"]["hidden_layers_units"] 21 | dropout_proba = config["architecture"]["dropout_proba"] 22 | 23 | # Add the first layer 24 | self.layers = nn.ModuleList([nn.Linear(state_size, hidden_layers_units[0])]) 25 | 26 | # Add a variable number of more hidden layers 27 | layer_sizes = zip(hidden_layers_units[:-1], hidden_layers_units[1:]) 28 | self.layers.extend([nn.Linear(h1, h2) for h1, h2 in layer_sizes]) 29 | 30 | # Add last layer 31 | self.output = nn.Linear(hidden_layers_units[-1], action_size) 32 | 33 | # Dropout 34 | self.dropout = nn.Dropout(p=dropout_proba) 35 | 36 | def forward(self, x): 37 | """ Forward pass """ 38 | # x = F.relu(self.fc1(state)) 39 | # x = F.relu(self.fc2(x)) 40 | # x = self.fc3(x) 41 | 42 | for layer in self.layers: 43 | x = F.relu(layer(x)) 44 | if self.config["architecture"]["use_dropout"]: 45 | x = self.dropout(x) 46 | 47 | x = self.output(x) 48 | 49 | return x 50 | 51 | if __name__ == '__main__': 52 | import json 53 | with open("config.json", "r") as f: 54 | config = json.load(f) 55 | 56 | net = QNetwork(state_size=37, action_size=4, config=config) 57 | print("net:", net) 58 | -------------------------------------------------------------------------------- /p1_navigation/report.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p1_navigation/report.pdf -------------------------------------------------------------------------------- /p1_navigation/requirements.txt: -------------------------------------------------------------------------------- 1 | matplotlib 2 | 
numpy>=1.11.0 3 | jupyter 4 | unityagents==0.4.0 5 | torch==0.4.0 6 | ipykernel 7 | -------------------------------------------------------------------------------- /p1_navigation/saved/DQN_exp/model_trained_solved.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p1_navigation/saved/DQN_exp/model_trained_solved.pth -------------------------------------------------------------------------------- /p1_navigation/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import torch 3 | 4 | 5 | def pick_device(config, logger): 6 | """ Pick device """ 7 | if config["cuda"] and not torch.cuda.is_available(): 8 | logger.warning("Warning: There's no CUDA support on this machine," 9 | "training is performed on cpu.") 10 | device = torch.device("cpu") 11 | elif not config["cuda"] and torch.cuda.is_available(): 12 | logger.info("Training is performed on cpu by user's choice") 13 | device = torch.device("cpu") 14 | elif not config["cuda"] and not torch.cuda.is_available(): 15 | logger.info("Training on cpu") 16 | device = torch.device("cpu") 17 | else: 18 | logger.info("Training on gpu") 19 | device = torch.device("cuda:" + str(config["gpu"])) 20 | 21 | return device 22 | 23 | def ensure_dir(path): 24 | if not os.path.exists(path): 25 | os.makedirs(path) -------------------------------------------------------------------------------- /p2_continuous_control/README.md: -------------------------------------------------------------------------------- 1 | # Project : Continuous Control 2 | 3 | ## Description 4 | For this project, we train a double-jointed arm agent to follow a target location. 5 | 6 |

7 | 8 |

9 | 10 | ## Problem Statement 11 | A reward of +0.1 is provided for each step that the agent's hand is in the goal location. 12 | Thus, the goal of the agent is to maintain its position at 13 | the target location for as many 14 | steps as possible. 15 | 16 | The observation space consists of 33 variables corresponding to position, 17 | rotation, velocity, and angular velocities of the arm. 18 | Each action is a vector with four numbers, corresponding to torque 19 | applicable to two joints. Every 20 | entry in the action vector should be a number between -1 and 1. 21 | 22 | The task is episodic, with 1000 timesteps per episode. In order to solve 23 | the environment, the agent must get an average score of +30 over 100 consecutive 24 | episodes. 25 | 26 | ## Files 27 | - `Continuous_Control.ipynb`: Notebook used to control and train the agent 28 | - `ddpg_agent.py`: Create an Agent class that interacts with and learns from the environment 29 | - `models.py`: Actor and Critic classes 30 | - `config.json`: Configuration file to store variables and paths 31 | - `utils.py`: Helper functions 32 | - `report.pdf`: Technical report 33 | 34 | ## Dependencies 35 | To be able to run this code, you will need an environment with Python 3. The 36 | dependencies are listed in the `requirements.txt` file, and you can install them 37 | using the following command: 38 | ``` 39 | pip install -r requirements.txt 40 | ``` 41 | 42 | Furthermore, you need to download the environment from one of the links below. You only need to select 43 | the environment that matches your operating system: 44 | - Linux : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P2/Reacher/one_agent/Reacher_Linux.zip) 45 | - MAC OSX : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P2/Reacher/Reacher.app.zip) 46 | - Windows : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P2/Reacher/Reacher_Windows_x86_64.zip) 47 | 48 | ## Running 49 | Run the cells in the notebook `Continuous_Control.ipynb` to train an agent that solves our required 50 | task of moving the double-jointed arm. 
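The solved actor network can likewise be reloaded on its own for evaluation. This is an illustrative sketch (the checkpoint path comes from the `saved` folder of this project; the state and action sizes are those of the Reacher environment described above):

```python
# Sketch: reload the solved DDPG actor and compute a deterministic action.
import json
import torch
from models import Actor

with open("config.json", "r") as f:
    config = json.load(f)

actor = Actor(state_size=33, action_size=4, config=config)
actor.load_state_dict(torch.load("saved/DDPG_exp/checkpoint_actor_solved.pth", map_location="cpu"))
actor.eval()

# `state` is the 33-dimensional observation returned by the Reacher environment;
# the tanh output layer already keeps the action in [-1, 1]:
# with torch.no_grad():
#     action = actor(torch.from_numpy(state).float()).numpy()
```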
-------------------------------------------------------------------------------- /p2_continuous_control/config.json: -------------------------------------------------------------------------------- 1 | { 2 | "exp_name": "DDPG_exp", 3 | "cuda": true, 4 | "gpu": 0, 5 | 6 | "optimizer_actor": { 7 | "optimizer_type": "Adam", 8 | "betas": [0.9, 0.999], 9 | "optimizer_params": { 10 | "lr": 1e-4, 11 | "eps": 1e-7, 12 | "weight_decay": 0 13 | } 14 | }, 15 | 16 | "optimizer_critic": { 17 | "optimizer_type": "Adam", 18 | "betas": [0.9, 0.999], 19 | "optimizer_params": { 20 | "lr": 1e-4, 21 | "eps": 1e-7, 22 | "weight_decay": 0 23 | } 24 | }, 25 | 26 | "DDPG": { 27 | "gamma": 0.99, 28 | "tau": 0.001, 29 | "buffer_size": 10e6 30 | }, 31 | 32 | "architecture": { 33 | "fc1_units": 250, 34 | "fc2_units": 100 35 | }, 36 | 37 | "trainer" : { 38 | "num_episodes": 500, 39 | "batch_size": 128, 40 | "save_dir": "./saved/", 41 | "save_freq": 200 42 | } 43 | } -------------------------------------------------------------------------------- /p2_continuous_control/ddpg_agent.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import random 3 | import copy 4 | import logging 5 | from collections import namedtuple, deque 6 | from models import Actor, Critic 7 | import torch 8 | import torch.nn.functional as F 9 | import torch.optim as optim 10 | from utils import pick_device 11 | 12 | 13 | class Agent(): 14 | """ Agent used to interact with and learns from the environment """ 15 | 16 | def __init__(self, state_size, action_size, config): 17 | """ Initialize an agent object """ 18 | 19 | self.state_size = state_size 20 | self.action_size = action_size 21 | self.config = config 22 | 23 | # logging for this class 24 | self.logger = logging.getLogger(self.__class__.__name__) 25 | 26 | # gpu support 27 | self.device = pick_device(config, self.logger) 28 | 29 | ## Actor local and target networks 30 | self.actor_local = Actor(state_size, action_size, config).to(self.device) 31 | self.actor_target = Actor(state_size, action_size, config).to(self.device) 32 | self.actor_optimizer = getattr(optim, config["optimizer_actor"]["optimizer_type"])( 33 | self.actor_local.parameters(), 34 | betas=tuple(config["optimizer_actor"]["betas"]), 35 | **config["optimizer_actor"]["optimizer_params"]) 36 | 37 | ## Critic local and target networks 38 | self.critic_local = Critic(state_size, action_size, config).to(self.device) 39 | self.critic_target = Critic(state_size, action_size, config).to(self.device) 40 | self.critic_optimizer = getattr(optim, config["optimizer_critic"]["optimizer_type"])( 41 | self.critic_local.parameters(), 42 | betas=tuple(config["optimizer_critic"]["betas"]), 43 | **config["optimizer_critic"]["optimizer_params"]) 44 | 45 | ## Noise process 46 | self.noise = OUNoise(action_size) 47 | 48 | ## Replay memory 49 | self.memory = ReplayBuffer( 50 | config=config, 51 | action_size=action_size, 52 | buffer_size=int(config["DDPG"]["buffer_size"]), 53 | batch_size=config["trainer"]["batch_size"] 54 | ) 55 | 56 | 57 | def step(self, state, action, reward, next_state, done): 58 | """ Save experience in replay memory, 59 | and use random sample from buffer to learn """ 60 | 61 | # Save experience in replay memory 62 | self.memory.add(state, action, reward, next_state, done) 63 | 64 | # learn every timestep as long as enough samples are available in memory 65 | if len(self.memory) > self.config["trainer"]["batch_size"]: 66 | experiences = self.memory.sample() 67 | 
self.learn(experiences, self.config["DDPG"]["gamma"]) 68 | 69 | 70 | def act(self, state): 71 | """ Returns actions for given state as per current policy """ 72 | 73 | # Convert state to tensor 74 | state = torch.from_numpy(state).float().to(self.device) 75 | 76 | ## Evaluation mode 77 | self.actor_local.eval() 78 | with torch.no_grad(): 79 | # Forward pass of local actor network (back to numpy for noise and clipping) 80 | action_values = self.actor_local.forward(state).cpu().data.numpy() 81 | 82 | ## Training mode 83 | self.actor_local.train() 84 | # Add noise to improve exploration to our actor policy 85 | action_values += self.noise.sample() 86 | # Clip action to stay in the range [-1, 1] for our task 87 | action_values = np.clip(action_values, -1, 1) 88 | 89 | return action_values 90 | 91 | 92 | def learn(self, experiences, gamma): 93 | """ Update value parameters using given batch of experience tuples """ 94 | 95 | states, actions, rewards, next_states, dones = experiences 96 | 97 | ## Update actor (policy) network using the sampled policy gradient 98 | # Compute actor loss 99 | actions_pred = self.actor_local.forward(states) 100 | actor_loss = -self.critic_local.forward(states, actions_pred).mean() 101 | # Minimize the loss 102 | self.actor_optimizer.zero_grad() 103 | actor_loss.backward() 104 | self.actor_optimizer.step() 105 | 106 | ## Update critic (value) network 107 | # Get predicted next-state actions and Q-values from target models 108 | actions_next = self.actor_target.forward(next_states) 109 | Q_targets_next = self.critic_target.forward(next_states, actions_next) 110 | # Compute Q-targets for current states 111 | Q_targets = rewards + (gamma * Q_targets_next * (1 - dones)) 112 | # Get expected Q-values from local critic model 113 | Q_expected = self.critic_local.forward(states, actions) 114 | # Compute loss 115 | critic_loss = F.mse_loss(Q_expected, Q_targets) 116 | # Minimize the loss 117 | self.critic_optimizer.zero_grad() 118 | critic_loss.backward() 119 | self.critic_optimizer.step() 120 | 121 | 122 | ## Update target networks with a soft update 123 | self.soft_update(self.actor_local, self.actor_target, self.config["DDPG"]["tau"]) 124 | self.soft_update(self.critic_local, self.critic_target, self.config["DDPG"]["tau"]) 125 | 126 | 127 | def soft_update(self, local_model, target_model, tau): 128 | """ Soft update model parameters, 129 | improves the stability of learning """ 130 | 131 | for target_param, local_param in zip(target_model.parameters(), local_model.parameters()): 132 | target_param.data.copy_(tau*local_param.data + (1.0 - tau)*target_param.data) 133 | 134 | 135 | 136 | class OUNoise(): 137 | """ Ornstein-Uhlenbeck process """ 138 | 139 | def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2): 140 | """ Initialize parameters and noise process """ 141 | self.mu = mu * np.ones(size) 142 | self.theta = theta 143 | self.sigma = sigma 144 | self.reset() 145 | 146 | def reset(self): 147 | """ Reset the internal state (= noise) to mean (mu). 
""" 148 | self.state = copy.copy(self.mu) 149 | 150 | def sample(self): 151 | """ Update internal state and return it as a noise sample """ 152 | x = self.state 153 | dx = self.theta * (self.mu - x) + self.sigma * np.array([random.random() for i in range(len(x))]) 154 | self.state = x + dx 155 | 156 | return self.state 157 | 158 | 159 | 160 | 161 | class ReplayBuffer(): 162 | """ Fixed-size buffer to store experience tuples """ 163 | 164 | def __init__(self, config, action_size, buffer_size, batch_size): 165 | """ Initialize a ReplayBuffer object """ 166 | 167 | self.config = config 168 | self.action_size = action_size 169 | self.memory = deque(maxlen=buffer_size) 170 | self.batch_size = batch_size 171 | self.experience = namedtuple("Experience", 172 | field_names=["state", "action", "reward", "next_state", "done"]) 173 | 174 | # logging for this class 175 | self.logger = logging.getLogger(self.__class__.__name__) 176 | 177 | # gpu support 178 | self.device = pick_device(config, self.logger) 179 | 180 | 181 | def add(self, state, action, reward, next_state, done): 182 | """ Add a new experience to memory """ 183 | e = self.experience(state, action, reward, next_state, done) 184 | self.memory.append(e) 185 | 186 | 187 | def sample(self): 188 | """ Randomly sample a batch of experiences from memory """ 189 | experiences = random.sample(self.memory, k=self.batch_size) 190 | 191 | states = torch.from_numpy( 192 | np.vstack([e.state for e in experiences if e is not None]) 193 | ).float().to(self.device) 194 | actions = torch.from_numpy( 195 | np.vstack([e.action for e in experiences if e is not None]) 196 | ).float().to(self.device) 197 | rewards = torch.from_numpy( 198 | np.vstack([e.reward for e in experiences if e is not None]) 199 | ).float().to(self.device) 200 | next_states = torch.from_numpy( 201 | np.vstack([e.next_state for e in experiences if e is not None]) 202 | ).float().to(self.device) 203 | dones = torch.from_numpy( 204 | np.vstack([e.done for e in experiences if e is not None]).astype(np.uint8) 205 | ).float().to(self.device) 206 | 207 | return (states, actions, rewards, next_states, dones) 208 | 209 | 210 | def __len__(self): 211 | """ Return the current size of internal memory """ 212 | return len(self.memory) -------------------------------------------------------------------------------- /p2_continuous_control/images/reacher_gif.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p2_continuous_control/images/reacher_gif.gif -------------------------------------------------------------------------------- /p2_continuous_control/models.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | 6 | 7 | class Actor(nn.Module): 8 | """ Actor (Policy) model """ 9 | 10 | def __init__(self, state_size, action_size, config): 11 | """ Initialize parameters and build model """ 12 | 13 | super(Actor, self).__init__() 14 | fc1_units = config["architecture"]["fc1_units"] 15 | fc2_units = config["architecture"]["fc2_units"] 16 | 17 | self.fc1 = nn.Linear(in_features=state_size, out_features=fc1_units) 18 | self.fc2 = nn.Linear(in_features=fc1_units, out_features=fc2_units) 19 | self.fc3 = nn.Linear(in_features=fc2_units, out_features=action_size) 20 | # weights initialization 21 | for m in self.modules(): 22 | if isinstance(m, nn.Linear): 23 | # FC layers have weights initialized with Glorot 24 | m.weight = 
nn.init.xavier_uniform(m.weight, gain=1) 25 | 26 | def forward(self, state): 27 | """ Build an actor (policy) network that maps states to actions """ 28 | x = F.relu(self.fc1(state)) 29 | x = F.relu(self.fc2(x)) 30 | x = F.tanh(self.fc3(x)) # outputs are in the range [-1, 1] 31 | 32 | return x 33 | 34 | 35 | class Critic(nn.Module): 36 | """ Critic (Value) Model """ 37 | 38 | def __init__(self, state_size, action_size, config): 39 | """ Initialize parameters and build model """ 40 | super(Critic, self).__init__() 41 | 42 | fc1_units = config["architecture"]["fc1_units"] 43 | fc2_units = config["architecture"]["fc2_units"] 44 | 45 | self.fc1 = nn.Linear(in_features=state_size, out_features=fc1_units) 46 | self.fc2 = nn.Linear(in_features=fc1_units + action_size, 47 | out_features=fc2_units) 48 | self.fc3 = nn.Linear(in_features=fc2_units, out_features=1) 49 | 50 | # weights initialization 51 | for m in self.modules(): 52 | if isinstance(m, nn.Linear): 53 | # FC layers have weights initialized with Glorot 54 | m.weight = nn.init.xavier_uniform(m.weight, gain=1) 55 | 56 | def forward(self, state, action): 57 | """ Build a critic (value) network that maps 58 | (state, action) pairs -> Q-values """ 59 | x = F.relu(self.fc1(state)) 60 | x = F.relu(self.fc2(torch.cat([x, action], dim=1))) # add action too for the mapping 61 | x = F.relu(self.fc3(x)) 62 | 63 | return x 64 | 65 | -------------------------------------------------------------------------------- /p2_continuous_control/report.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p2_continuous_control/report.pdf -------------------------------------------------------------------------------- /p2_continuous_control/requirements.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p2_continuous_control/requirements.txt -------------------------------------------------------------------------------- /p2_continuous_control/saved/DDPG_exp/checkpoint_actor_solved.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p2_continuous_control/saved/DDPG_exp/checkpoint_actor_solved.pth -------------------------------------------------------------------------------- /p2_continuous_control/saved/DDPG_exp/checkpoint_critic_solved.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p2_continuous_control/saved/DDPG_exp/checkpoint_critic_solved.pth -------------------------------------------------------------------------------- /p3_collab_compet/DDPGAgents.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import logging 3 | from models import Actor, Critic 4 | from ReplayBuffer import ReplayBuffer 5 | from OUNoise import OUNoise 6 | import torch 7 | import torch.nn.functional as F 8 | import torch.optim as optim 9 | from utils import pick_device 10 | 11 | import pdb 12 | 13 | class DDPGAgents(): 14 | """ Agent used to interact with and learns from the environment """ 15 | 16 | def __init__(self, state_size, action_size, config): 17 | """ Initialize an agent object """ 18 | 19 | self.state_size = 
state_size 20 | self.action_size = action_size 21 | self.config = config 22 | 23 | # retrieve number of agents 24 | self.num_agents = config["DDPG"]["num_agents"] 25 | 26 | # logging for this class 27 | self.logger = logging.getLogger(self.__class__.__name__) 28 | 29 | # gpu support 30 | self.device = pick_device(config, self.logger) 31 | 32 | ## Actor local and target networks 33 | self.actor_local = Actor(state_size, action_size, config).to(self.device) 34 | self.actor_target = Actor(state_size, action_size, config).to(self.device) 35 | self.actor_optimizer = getattr(optim, config["optimizer_actor"]["optimizer_type"])( 36 | self.actor_local.parameters(), 37 | betas=tuple(config["optimizer_actor"]["betas"]), 38 | **config["optimizer_actor"]["optimizer_params"]) 39 | 40 | ## Critic local and target networks 41 | self.critic_local = Critic(state_size, action_size, config).to(self.device) 42 | self.critic_target = Critic(state_size, action_size, config).to(self.device) 43 | self.critic_optimizer = getattr(optim, config["optimizer_critic"]["optimizer_type"])( 44 | self.critic_local.parameters(), 45 | betas=tuple(config["optimizer_critic"]["betas"]), 46 | **config["optimizer_critic"]["optimizer_params"]) 47 | 48 | ## Noise process 49 | self.noise = OUNoise((self.num_agents, action_size)) 50 | 51 | ## Replay memory 52 | self.memory = ReplayBuffer( 53 | config=config, 54 | action_size=action_size, 55 | buffer_size=int(config["DDPG"]["buffer_size"]), 56 | batch_size=config["trainer"]["batch_size"] 57 | ) 58 | 59 | 60 | def step(self, state, action, reward, next_state, done): 61 | """ Save experience in replay memory, 62 | and use random sample from buffer to learn """ 63 | 64 | # Save experience in replay memory shared by all agents 65 | for agent in range(self.num_agents): 66 | self.memory.add(state[agent, :], 67 | action[agent, :], 68 | reward[agent], 69 | next_state[agent, :], 70 | done[agent] 71 | ) 72 | 73 | # learn every timestep as long as enough samples are available in memory 74 | if len(self.memory) > self.config["trainer"]["batch_size"]: 75 | experiences = self.memory.sample() 76 | self.learn(experiences, self.config["DDPG"]["gamma"]) 77 | 78 | 79 | def act(self, states, add_noise=False): 80 | """ Returns actions for given state as per current policy """ 81 | 82 | # Convert state to tensor² 83 | states = torch.from_numpy(states).float().to(self.device) 84 | 85 | # prepare actions numpy array for all agents 86 | actions = np.zeros((self.num_agents, self.action_size)) 87 | 88 | ## Evaluation mode 89 | self.actor_local.eval() 90 | with torch.no_grad(): 91 | # Forward pass of local actor network 92 | for agent, state in enumerate(states): 93 | action_values = self.actor_local.forward(state).cpu().data.numpy() 94 | actions[agent, :] = action_values 95 | 96 | # pdb.set_trace() 97 | ## Training mode 98 | self.actor_local.train() 99 | if add_noise: 100 | # Add noise to improve exploration to our actor policy 101 | # action_values += torch.from_numpy(self.noise.sample()).type(torch.FloatTensor).to(self.device) 102 | actions += self.noise.sample() 103 | # Clip action to stay in the range [-1, 1] for our task 104 | actions = np.clip(actions, -1, 1) 105 | 106 | return actions 107 | 108 | 109 | def learn(self, experiences, gamma): 110 | """ Update value parameters using given batch of experience tuples """ 111 | 112 | states, actions, rewards, next_states, dones = experiences 113 | 114 | ## Update actor (policy) network using the sampled policy gradient 115 | # Compute actor loss 116 | actions_pred 
= self.actor_local.forward(states) 117 | actor_loss = -self.critic_local.forward(states, actions_pred).mean() 118 | # Minimize the loss 119 | self.actor_optimizer.zero_grad() 120 | actor_loss.backward() 121 | self.actor_optimizer.step() 122 | 123 | ## Update critic (value) network 124 | # Get predicted next-state actions and Q-values from target models 125 | actions_next = self.actor_target.forward(next_states) 126 | Q_targets_next = self.critic_target.forward(next_states, actions_next) 127 | # Compute Q-targets for current states 128 | Q_targets = rewards + (gamma * Q_targets_next * (1 - dones)) 129 | # Get expected Q-values from local critic model 130 | Q_expected = self.critic_local.forward(states, actions) 131 | # Compute loss 132 | critic_loss = F.mse_loss(Q_expected, Q_targets) 133 | # Minimize the loss 134 | self.critic_optimizer.zero_grad() 135 | critic_loss.backward() 136 | self.critic_optimizer.step() 137 | 138 | 139 | ## Update target networks with a soft update 140 | self.soft_update(self.actor_local, self.actor_target, self.config["DDPG"]["tau"]) 141 | self.soft_update(self.critic_local, self.critic_target, self.config["DDPG"]["tau"]) 142 | 143 | 144 | def soft_update(self, local_model, target_model, tau): 145 | """ Soft update model parameters, 146 | improves the stability of learning """ 147 | 148 | for target_param, local_param in zip(target_model.parameters(), local_model.parameters()): 149 | target_param.data.copy_(tau*local_param.data + (1.0 - tau)*target_param.data) 150 | 151 | 152 | def reset(self): 153 | """ Reset noise """ 154 | self.noise.reset() 155 | 156 | 157 | 158 | 159 | 160 | 161 | -------------------------------------------------------------------------------- /p3_collab_compet/OUNoise.py: -------------------------------------------------------------------------------- 1 | import random 2 | import numpy as np 3 | import copy 4 | 5 | class OUNoise(): 6 | """ Ornstein-Uhlenbeck process """ 7 | 8 | def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2): 9 | """ Initialize parameters and noise process """ 10 | self.mu = mu * np.ones(size) 11 | self.theta = theta 12 | self.sigma = sigma 13 | self.reset() 14 | 15 | def reset(self): 16 | """ Reset the internal state (= noise) to mean (mu). """ 17 | self.state = copy.copy(self.mu) 18 | 19 | def sample(self): 20 | """ Update internal state and return it as a noise sample """ 21 | x = self.state 22 | dx = self.theta * (self.mu - x) + self.sigma * np.random.standard_normal(self.mu.shape) 23 | self.state = x + dx 24 | 25 | return self.state -------------------------------------------------------------------------------- /p3_collab_compet/README.md: -------------------------------------------------------------------------------- 1 | # Project : Collaboration and Competition 2 | 3 | ## Description 4 | For this project, we train a pair of agents to play tennis. 5 | 6 |

7 | <p align="center"><img src="images/tennis_gif.gif" alt="Trained agents playing tennis"/></p> 8 | 

9 | 10 | ## Problem Statement 11 | A reward of +0.1 is provided each time an agent hits the ball over the net. 12 | A reward of -0.01 is provided if an agent lets the ball hit the ground or hits the ball out of bounds. 13 | Thus, the goal of each agent is to keep the ball in play. 14 | 15 | The observation space consists of 24 variables corresponding to the position and velocity of the ball and racket. Each agent receives its own, local observation. Two continuous actions are available, corresponding to movement toward (or away from) the net, and jumping. 16 | 17 | The task is episodic. In order to solve 18 | the environment, the agents must get an average score of +0.5 over 100 consecutive 19 | episodes, taking the maximum score over both agents in each episode. 20 | 21 | ## Files 22 | - `Tennis.ipynb`: Notebook used to control and train the agents 23 | - `DDPGAgents.py`: Creates a DDPGAgents class that interacts with and learns from the environment 24 | - `ReplayBuffer.py`: Replay buffer class to store the experiences 25 | - `OUNoise.py`: Ornstein-Uhlenbeck noise for the actor to improve exploration 26 | - `model.py`: Actor and Critic classes 27 | - `config.json`: Configuration file to store variables and paths 28 | - `utils.py`: Helper functions 29 | - `report.pdf`: Technical report 30 | 31 | ## Dependencies 32 | To be able to run this code, you will need an environment with Python 3. 33 | The dependencies are listed in the `requirements.txt` file, and you can install them 34 | using the following command: 35 | ``` 36 | pip install -r requirements.txt 37 | ``` 38 | 39 | Furthermore, you need to download the environment from one of the links below. You only need to select 40 | the environment that matches your operating system: 41 | - Linux : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis_Linux.zip) 42 | - MAC OSX : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis.app.zip) 43 | - Windows : [link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P3/Tennis/Tennis_Windows_x86_64.zip) 44 | 45 | ## Running 46 | Run the cells in the notebook `Tennis.ipynb` to train the pair of agents that solves our required 47 | task of keeping the ball in play. 
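After training, you can also reload the solved checkpoint stored under `saved/DDPGAgents_exp/` and watch the agents play outside the workspace. The snippet below is a minimal sketch that mirrors the setup cells of `Tennis.ipynb`; the `file_name` path is a placeholder for wherever you unzipped the environment for your OS, while the `DDPGAgents` constructor, `config.json` and the checkpoint name come straight from this repository.

```python
import json
import numpy as np
import torch
from unityagents import UnityEnvironment
from DDPGAgents import DDPGAgents

with open("config.json", "r") as f:
    config = json.load(f)

# Placeholder path: point it to the Tennis environment downloaded above
env = UnityEnvironment(file_name="./Tennis_Linux/Tennis.x86_64")
brain_name = env.brain_names[0]

# Rebuild the agents and load the solved actor weights
agent = DDPGAgents(state_size=24, action_size=2, config=config)
agent.actor_local.load_state_dict(
    torch.load("saved/DDPGAgents_exp/checkpoint_actor_solved.pth", map_location="cpu"))

env_info = env.reset(train_mode=False)[brain_name]   # train_mode=False to watch in real time
states = env_info.vector_observations
scores = np.zeros(config["DDPG"]["num_agents"])

while True:
    actions = agent.act(states)                      # greedy actions, no OU noise added
    env_info = env.step(actions)[brain_name]
    states = env_info.vector_observations
    scores += env_info.rewards
    if np.any(env_info.local_done):                  # episode ends when the ball is dropped
        break

print("Episode score (max over both agents): {:.3f}".format(np.max(scores)))
env.close()
```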
-------------------------------------------------------------------------------- /p3_collab_compet/ReplayBuffer.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import torch 3 | from collections import namedtuple, deque 4 | import random 5 | from utils import pick_device 6 | import numpy as np 7 | 8 | class ReplayBuffer(): 9 | """ Fixed-size buffer to store experience tuples """ 10 | 11 | def __init__(self, config, action_size, buffer_size, batch_size): 12 | """ Initialize a ReplayBuffer object """ 13 | 14 | self.config = config 15 | self.action_size = action_size 16 | self.memory = deque(maxlen=buffer_size) 17 | self.batch_size = batch_size 18 | self.experience = namedtuple("Experience", 19 | field_names=["state", "action", "reward", "next_state", "done"]) 20 | 21 | # logging for this class 22 | self.logger = logging.getLogger(self.__class__.__name__) 23 | 24 | # gpu support 25 | self.device = pick_device(config, self.logger) 26 | 27 | 28 | def add(self, state, action, reward, next_state, done): 29 | """ Add a new experience to memory """ 30 | e = self.experience(state, action, reward, next_state, done) 31 | self.memory.append(e) 32 | 33 | 34 | def sample(self): 35 | """ Randomly sample a batch of experiences from memory """ 36 | experiences = random.sample(self.memory, k=self.batch_size) 37 | 38 | states = torch.from_numpy( 39 | np.vstack([e.state for e in experiences if e is not None]) 40 | ).float().to(self.device) 41 | actions = torch.from_numpy( 42 | np.vstack([e.action for e in experiences if e is not None]) 43 | ).float().to(self.device) 44 | rewards = torch.from_numpy( 45 | np.vstack([e.reward for e in experiences if e is not None]) 46 | ).float().to(self.device) 47 | next_states = torch.from_numpy( 48 | np.vstack([e.next_state for e in experiences if e is not None]) 49 | ).float().to(self.device) 50 | dones = torch.from_numpy( 51 | np.vstack([e.done for e in experiences if e is not None]).astype(np.uint8) 52 | ).float().to(self.device) 53 | 54 | return (states, actions, rewards, next_states, dones) 55 | 56 | 57 | def __len__(self): 58 | """ Return the current size of internal memory """ 59 | return len(self.memory) -------------------------------------------------------------------------------- /p3_collab_compet/Tennis.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Collaboration and Competition\n", 8 | "\n", 9 | "---\n", 10 | "\n", 11 | "You are welcome to use this coding environment to train your agent for the project. Follow the instructions below to get started!\n", 12 | "\n", 13 | "### 1. Start the Environment\n", 14 | "\n", 15 | "Run the next code cell to install a few packages. This line will take a few minutes to run!" 16 | ] 17 | }, 18 | { 19 | "cell_type": "code", 20 | "execution_count": 1, 21 | "metadata": {}, 22 | "outputs": [], 23 | "source": [ 24 | "!pip -q install ./python" 25 | ] 26 | }, 27 | { 28 | "cell_type": "markdown", 29 | "metadata": {}, 30 | "source": [ 31 | "The environment is already saved in the Workspace and can be accessed at the file path provided below. 
" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": 2, 37 | "metadata": {}, 38 | "outputs": [ 39 | { 40 | "name": "stderr", 41 | "output_type": "stream", 42 | "text": [ 43 | "INFO:unityagents:\n", 44 | "'Academy' started successfully!\n", 45 | "Unity Academy name: Academy\n", 46 | " Number of Brains: 1\n", 47 | " Number of External Brains : 1\n", 48 | " Lesson number : 0\n", 49 | " Reset Parameters :\n", 50 | "\t\t\n", 51 | "Unity brain name: TennisBrain\n", 52 | " Number of Visual Observations (per agent): 0\n", 53 | " Vector Observation space type: continuous\n", 54 | " Vector Observation space size (per agent): 8\n", 55 | " Number of stacked Vector Observation: 3\n", 56 | " Vector Action space type: continuous\n", 57 | " Vector Action space size (per agent): 2\n", 58 | " Vector Action descriptions: , \n" 59 | ] 60 | } 61 | ], 62 | "source": [ 63 | "from unityagents import UnityEnvironment\n", 64 | "import numpy as np\n", 65 | "\n", 66 | "env = UnityEnvironment(file_name=\"/data/Tennis_Linux_NoVis/Tennis\")" 67 | ] 68 | }, 69 | { 70 | "cell_type": "markdown", 71 | "metadata": {}, 72 | "source": [ 73 | "Environments contain **_brains_** which are responsible for deciding the actions of their associated agents. Here we check for the first brain available, and set it as the default brain we will be controlling from Python." 74 | ] 75 | }, 76 | { 77 | "cell_type": "code", 78 | "execution_count": 3, 79 | "metadata": {}, 80 | "outputs": [], 81 | "source": [ 82 | "# get the default brain\n", 83 | "brain_name = env.brain_names[0]\n", 84 | "brain = env.brains[brain_name]" 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": 4, 90 | "metadata": {}, 91 | "outputs": [ 92 | { 93 | "data": { 94 | "text/plain": [ 95 | "('TennisBrain', )" 96 | ] 97 | }, 98 | "execution_count": 4, 99 | "metadata": {}, 100 | "output_type": "execute_result" 101 | } 102 | ], 103 | "source": [ 104 | "brain_name, brain" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "metadata": {}, 110 | "source": [ 111 | "### 2. Examine the State and Action Spaces\n", 112 | "\n", 113 | "Run the code cell below to print some information about the environment." 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": 5, 119 | "metadata": {}, 120 | "outputs": [ 121 | { 122 | "name": "stdout", 123 | "output_type": "stream", 124 | "text": [ 125 | "Number of agents: 2\n", 126 | "Size of each action: 2\n", 127 | "There are 2 agents. Each observes a state with length: 24\n", 128 | "The state for the first agent looks like: \n", 129 | " [ 0. 0. 0. 0. 0. 0. 0.\n", 130 | " 0. 0. 0. 0. 0. 0. 0.\n", 131 | " 0. 0. -6.65278625 -1.5 -0. 0.\n", 132 | " 6.83172083 6. -0. 0. ]\n" 133 | ] 134 | } 135 | ], 136 | "source": [ 137 | "# reset the environment\n", 138 | "env_info = env.reset(train_mode=True)[brain_name]\n", 139 | "\n", 140 | "# number of agents \n", 141 | "num_agents = len(env_info.agents)\n", 142 | "print('Number of agents:', num_agents)\n", 143 | "\n", 144 | "# size of each action\n", 145 | "action_size = brain.vector_action_space_size\n", 146 | "print('Size of each action:', action_size)\n", 147 | "\n", 148 | "# examine the state space \n", 149 | "states = env_info.vector_observations\n", 150 | "state_size = states.shape[1]\n", 151 | "print('There are {} agents. 
Each observes a state with length: {}'.format(states.shape[0], state_size))\n", 152 | "print('The state for the first agent looks like: \\n', states[0])" 153 | ] 154 | }, 155 | { 156 | "cell_type": "markdown", 157 | "metadata": {}, 158 | "source": [ 159 | "### 3. Take Random Actions in the Environment\n", 160 | "\n", 161 | "In the next code cell, you will learn how to use the Python API to control the agent and receive feedback from the environment.\n", 162 | "\n", 163 | "Note that **in this coding environment, you will not be able to watch the agents while they are training**, and you should set `train_mode=True` to restart the environment." 164 | ] 165 | }, 166 | { 167 | "cell_type": "code", 168 | "execution_count": 6, 169 | "metadata": { 170 | "scrolled": false 171 | }, 172 | "outputs": [], 173 | "source": [ 174 | "# for i in range(5): # play game for 5 episodes\n", 175 | "# env_info = env.reset(train_mode=False)[brain_name] # reset the environment \n", 176 | "# states = env_info.vector_observations # get the current state (for each agent)\n", 177 | "# scores = np.zeros(num_agents) # initialize the score (for each agent)\n", 178 | "# while True:\n", 179 | "# actions = np.random.randn(num_agents, action_size) # select an action (for each agent)\n", 180 | "# actions = np.clip(actions, -1, 1) # all actions between -1 and 1\n", 181 | "# env_info = env.step(actions)[brain_name] # send all actions to tne environment\n", 182 | "# next_states = env_info.vector_observations # get next state (for each agent)\n", 183 | "# rewards = env_info.rewards # get reward (for each agent)\n", 184 | "# dones = env_info.local_done # see if episode finished\n", 185 | "# scores += env_info.rewards # update the score (for each agent)\n", 186 | "# states = next_states # roll over states to next time step\n", 187 | "# if np.any(dones): # exit loop if episode finished\n", 188 | "# break\n", 189 | "# print('Total score (averaged over agents) this episode: {}'.format(np.mean(scores)))" 190 | ] 191 | }, 192 | { 193 | "cell_type": "markdown", 194 | "metadata": {}, 195 | "source": [ 196 | "When finished, you can close the environment." 197 | ] 198 | }, 199 | { 200 | "cell_type": "code", 201 | "execution_count": 7, 202 | "metadata": {}, 203 | "outputs": [], 204 | "source": [ 205 | "# env.close()" 206 | ] 207 | }, 208 | { 209 | "cell_type": "markdown", 210 | "metadata": {}, 211 | "source": [ 212 | "### 4. It's Your Turn!\n", 213 | "\n", 214 | "Now it's your turn to train your own agent to solve the environment! A few **important notes**:\n", 215 | "- When training the environment, set `train_mode=True`, so that the line for resetting the environment looks like the following:\n", 216 | "```python\n", 217 | "env_info = env.reset(train_mode=True)[brain_name]\n", 218 | "```\n", 219 | "- To structure your work, you're welcome to work directly in this Jupyter notebook, or you might like to start over with a new file! You can see the list of files in the workspace by clicking on **_Jupyter_** in the top left corner of the notebook.\n", 220 | "- In this coding environment, you will not be able to watch the agents while they are training. However, **_after training the agents_**, you can download the saved model weights to watch the agents on your own machine! 
" 221 | ] 222 | }, 223 | { 224 | "cell_type": "code", 225 | "execution_count": 8, 226 | "metadata": {}, 227 | "outputs": [], 228 | "source": [ 229 | "## code to keep session awake in Udacity workspace\n", 230 | "import signal\n", 231 | "\n", 232 | "from contextlib import contextmanager\n", 233 | "\n", 234 | "import requests\n", 235 | "\n", 236 | "\n", 237 | "DELAY = INTERVAL = 4 * 60 # interval time in seconds\n", 238 | "MIN_DELAY = MIN_INTERVAL = 2 * 60\n", 239 | "KEEPALIVE_URL = \"https://nebula.udacity.com/api/v1/remote/keep-alive\"\n", 240 | "TOKEN_URL = \"http://metadata.google.internal/computeMetadata/v1/instance/attributes/keep_alive_token\"\n", 241 | "TOKEN_HEADERS = {\"Metadata-Flavor\":\"Google\"}\n", 242 | "\n", 243 | "\n", 244 | "def _request_handler(headers):\n", 245 | " def _handler(signum, frame):\n", 246 | " requests.request(\"POST\", KEEPALIVE_URL, headers=headers)\n", 247 | " return _handler\n", 248 | "\n", 249 | "\n", 250 | "@contextmanager\n", 251 | "def active_session(delay=DELAY, interval=INTERVAL):\n", 252 | " \"\"\"\n", 253 | " Example:\n", 254 | "\n", 255 | " from workspace_utils import active session\n", 256 | "\n", 257 | " with active_session():\n", 258 | " # do long-running work here\n", 259 | " \"\"\"\n", 260 | " token = requests.request(\"GET\", TOKEN_URL, headers=TOKEN_HEADERS).text\n", 261 | " headers = {'Authorization': \"STAR \" + token}\n", 262 | " delay = max(delay, MIN_DELAY)\n", 263 | " interval = max(interval, MIN_INTERVAL)\n", 264 | " original_handler = signal.getsignal(signal.SIGALRM)\n", 265 | " try:\n", 266 | " signal.signal(signal.SIGALRM, _request_handler(headers))\n", 267 | " signal.setitimer(signal.ITIMER_REAL, delay, interval)\n", 268 | " yield\n", 269 | " finally:\n", 270 | " signal.signal(signal.SIGALRM, original_handler)\n", 271 | " signal.setitimer(signal.ITIMER_REAL, 0)\n", 272 | "\n", 273 | "\n", 274 | "def keep_awake(iterable, delay=DELAY, interval=INTERVAL):\n", 275 | " \"\"\"\n", 276 | " Example:\n", 277 | "\n", 278 | " from workspace_utils import keep_awake\n", 279 | "\n", 280 | " for i in keep_awake(range(5)):\n", 281 | " # do iteration with lots of work here\n", 282 | " \"\"\"\n", 283 | " with active_session(delay, interval): yield from iterable" 284 | ] 285 | }, 286 | { 287 | "cell_type": "code", 288 | "execution_count": 9, 289 | "metadata": {}, 290 | "outputs": [], 291 | "source": [ 292 | "## Watch changes and reload automatically\n", 293 | "% load_ext autoreload\n", 294 | "% autoreload 2" 295 | ] 296 | }, 297 | { 298 | "cell_type": "code", 299 | "execution_count": 10, 300 | "metadata": {}, 301 | "outputs": [], 302 | "source": [ 303 | "import pdb\n", 304 | "import json\n", 305 | "import numpy as np \n", 306 | "import torch \n", 307 | "from collections import deque\n", 308 | "from DDPGAgents import DDPGAgents\n", 309 | "from utils import ensure_dir\n", 310 | "import matplotlib.pyplot as plt\n", 311 | "\n", 312 | "import logging\n", 313 | "logging.basicConfig(level=logging.INFO, format='')\n", 314 | "\n", 315 | "with open(\"config.json\", \"r\") as f: \n", 316 | " config = json.load(f)" 317 | ] 318 | }, 319 | { 320 | "cell_type": "code", 321 | "execution_count": 11, 322 | "metadata": {}, 323 | "outputs": [ 324 | { 325 | "name": "stderr", 326 | "output_type": "stream", 327 | "text": [ 328 | "INFO:DDPGAgents:Training on gpu\n", 329 | "INFO:ReplayBuffer:Training on gpu\n" 330 | ] 331 | }, 332 | { 333 | "name": "stdout", 334 | "output_type": "stream", 335 | "text": [ 336 | "Episode 100\tAverage Score: 0.000\n", 337 | "Episode 
200\tAverage Score: 0.000\n", 338 | "Episode 300\tAverage Score: 0.000\n", 339 | "Episode 400\tAverage Score: 0.000\n", 340 | "Episode 500\tAverage Score: 0.000\n", 341 | "Episode 600\tAverage Score: 0.002\n", 342 | "Episode 700\tAverage Score: 0.000\n", 343 | "Episode 800\tAverage Score: 0.000\n", 344 | "Episode 900\tAverage Score: 0.000\n", 345 | "Episode 1000\tAverage Score: 0.000\n", 346 | "Episode 1100\tAverage Score: 0.011\n", 347 | "Episode 1200\tAverage Score: 0.019\n", 348 | "Episode 1300\tAverage Score: 0.006\n", 349 | "Episode 1400\tAverage Score: 0.029\n", 350 | "Episode 1500\tAverage Score: 0.060\n", 351 | "Episode 1600\tAverage Score: 0.235\n", 352 | "Episode 1700\tAverage Score: 0.133\n", 353 | "Episode 1800\tAverage Score: 0.168\n", 354 | "Episode 1900\tAverage Score: 0.324\n", 355 | "Episode 1909\tAverage Score: 0.503\n", 356 | "Environment solved in 1809 episodes!\tAverage Score: 0.503\n" 357 | ] 358 | } 359 | ], 360 | "source": [ 361 | "agent = DDPGAgents(state_size=24, action_size=2, config=config)\n", 362 | "brain_name = env.brain_names[0]\n", 363 | "\n", 364 | "def ddpg(agent, \n", 365 | " brain_name, \n", 366 | " config, \n", 367 | " n_episodes=config[\"trainer\"][\"num_episodes\"]\n", 368 | " ):\n", 369 | " \"\"\" Deep Deterministic Policy Gradient \"\"\"\n", 370 | " \n", 371 | " # Set logger for this function\n", 372 | " logger = logging.getLogger(\"ddpg\")\n", 373 | " \n", 374 | " # number of agents\n", 375 | " num_agents = config[\"DDPG\"][\"num_agents\"]\n", 376 | " \n", 377 | " max_t = 1000\n", 378 | " \n", 379 | " flag = False # When environment is technically solved\n", 380 | " # Save path \n", 381 | " save_path = config[\"trainer\"][\"save_dir\"] + config[\"exp_name\"] + \"/\"\n", 382 | " ensure_dir(save_path)\n", 383 | " scores = [] # list containing scores from each episodes \n", 384 | " scores_window = deque(maxlen=100)\n", 385 | " \n", 386 | " for i_episode in keep_awake(range(1, n_episodes + 1)):\n", 387 | " # reset the environment\n", 388 | " env_info = env.reset(train_mode=True)[brain_name]\n", 389 | " \n", 390 | " # reset noise\n", 391 | " agent.reset()\n", 392 | " \n", 393 | " # get the current state\n", 394 | " state = env_info.vector_observations\n", 395 | "\n", 396 | " # score of the agents\n", 397 | " score = np.zeros(num_agents)\n", 398 | " \n", 399 | " for t in range(max_t):\n", 400 | " # choose actions\n", 401 | " action = agent.act(state)\n", 402 | " # send the actions to the environment \n", 403 | " env_info = env.step(action)[brain_name]\n", 404 | " # get the next state\n", 405 | " next_state = env_info.vector_observations\n", 406 | " # get the rewards\n", 407 | " rewards = env_info.rewards\n", 408 | " # see if episode has finished\n", 409 | " dones = env_info.local_done\n", 410 | " # step \n", 411 | " agent.step(state, action, rewards, next_state, dones)\n", 412 | " # accumulate rewards into score variable\n", 413 | " score += rewards\n", 414 | " # get next_state and set it to state\n", 415 | " state = next_state\n", 416 | " \n", 417 | " if any(dones): \n", 418 | " break\n", 419 | " \n", 420 | " # save most recent scores (mean amongst the agents)\n", 421 | " scores.append(np.max(score))\n", 422 | " scores_window.append(np.max(score))\n", 423 | " \n", 424 | " print('\\rEpisode {}\\tAverage Score: {:.3f}'.format(i_episode, np.mean(scores_window)), end=\"\")\n", 425 | " \n", 426 | " if (i_episode % 100 == 0):\n", 427 | " print(\"\\rEpisode {}\\tAverage Score: {:.3f}\".format(i_episode, \\\n", 428 | " np.mean(scores_window)))\n", 429 | " \n", 
430 | " # Save occasionnaly \n", 431 | " if (i_episode % config[\"trainer\"][\"save_freq\"] == 0):\n", 432 | " torch.save(agent.actor_local.state_dict(), save_path + \n", 433 | " \"checkpoint_actor_\" + str(i_episode) + \".pth\")\n", 434 | " torch.save(agent.critic_local.state_dict(), save_path + \n", 435 | " \"checkpoint_critic_\" + str(i_episode) + \".pth\")\n", 436 | " \n", 437 | " # Check if envionment solved \n", 438 | " if not flag:\n", 439 | " if (np.mean(scores_window) >= 0.5):\n", 440 | " print(\"\\nEnvironment solved in {:d} episodes!\\tAverage Score: {:.3f}\".format(\n", 441 | " i_episode-100, np.mean(scores_window)))\n", 442 | " # Save solved model \n", 443 | " torch.save(agent.actor_local.state_dict(), save_path + \n", 444 | " \"checkpoint_actor_solved.pth\")\n", 445 | " torch.save(agent.critic_local.state_dict(), save_path + \n", 446 | " \"checkpoint_critic_solved.pth\")\n", 447 | " flag = True\n", 448 | " \n", 449 | " break\n", 450 | " \n", 451 | " return scores\n", 452 | " \n", 453 | "scores = ddpg(agent=agent, \n", 454 | " brain_name=brain_name, \n", 455 | " config=config)" 456 | ] 457 | }, 458 | { 459 | "cell_type": "code", 460 | "execution_count": 12, 461 | "metadata": {}, 462 | "outputs": [], 463 | "source": [ 464 | "env.close()" 465 | ] 466 | }, 467 | { 468 | "cell_type": "code", 469 | "execution_count": 13, 470 | "metadata": {}, 471 | "outputs": [ 472 | { 473 | "data": { 474 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAA7sAAAK9CAYAAADltHtfAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAIABJREFUeJzs3XmYbGldJ/jfm5n31kYVBVYBBVRRyKKoKEiBC7ZNj2gLqLRtj+jYOtJjMw+jYzsPzkzrdLuNC44KLiiCAw8uNKMsTaMUskOxCVRBFbUXRS3UrapbdWu7e9bNzHjnj8zIPBEZkXlORkTmWT4fnzIjI06c8544qc/9xu99fyflnAMAAADaZG6vBwAAAADTJuwCAADQOsIuAAAArSPsAgAA0DrCLgAAAK0j7AIAANA6wi4AAACtI+wCAADQOsIuAAAArbOw1wOo6rzzzssXX3zxXg8DAACAGbjiiivuyzmfP+l+Ghd2L7744rj88sv3ehgAAADMQErp9mnsxzRmAAAAWkfYBQAAoHWEXQAAAFpH2AUAAKB1hF0AAABaR9gFAACgdYRdAAAAWkfYBQAAoHWEXQAAAFpH2AUAAKB1hF0AAABaR9gFAACgdYRdAAAAWkfYBQAAoHWEXQAAAFpH2AUAAKB1hF0AAABaR9gFAACgdYRdAAAAWkfYBQAAoHWEXQAAAFpH2AUAAKB1hF0AAABaR9gFAACgdYRdAAAAWkfYBQAAoHWEXQAAAFpH2AUAAGiZf/jSXfGqv7sqbrvveNzxwIl41d9dFf/tyjv3eli7amGvBwAAAMB0/dx/+WJERHzy5kNx1v6FuOW+4/HOLxyIlz7rCXs8st0j7AIAALTUscXlWFrJez2MPWEaMwAAQEstzHc38nX3zAEAAFpu33yKnFV2AQAAaJGFue5Gvu6eOQAAQMstzKeB37tU5RV2AQAAWmphbjDsrvSEXQAAABqoGGiHG1QtC7sAAAA00dJKb/3xcGW3+FrbCbsAAAAtsjxQ2R0Mu8sduueusAsAANAiK4VAO9yN2TRmAAAAGmmpN34a83LPNGYAAAAaqDhVeWE+RUpp5GttJ+wCAAC0SLEJ1b6hbswaVAEAANBIAw2qNk1jVtkFAACggZaLtx6an4ucNwKuyi4AAACNtLSyRWXXml0AAACaqNhxeWF++NZDKrsAAAA0ULGyu2+osruksgsAAEATFdfszpnGDAAAQBsMd1wu/rpkGjMAAFA3f/WZ2+L6u4+U3v7TX7kv/o93XBUPHD81u0GxKx44fir+4AM3xkqJWwcVOy6/44oDcfjk0vrvr/ybK+LuwydnMsa6EXYBAKAhfuW/XRsv+qNPlN7+d993Q/zd5Qfii199cIajYjf853dfE3/ykZvjspsObbttL48PxItLvTi2uDzNodWWsAsAAC114tTKXg+BKVlcWr2WZSq7o/z+f/8t0xxOIwi7AAAAtI6wCwAAQOsIuwAA0HJbLOGkYXZ6KdP2m7SOsAsAANAi2325kTqSfIVdAACAhthpTu1KwC0SdgEAAGgdYRcAAKDlVHYBAIDW0Z+qPVzL8oRdAACAFtm++3Y3yrzCLgAAQEPsuEFVRwJukbALAADQctbsAgAAUFvW7JYn7AIAQMvl7RdxUnNdrMxOStgFAACouSrfV2y3aVeCs7ALAADQEDtuUNWVhFsg7AIAANA6Mwu7KaULU0ofTSldn1K6NqX0H0Zs84KU0uGU0pVr//3KrMYDAADQdFZfl7cww30vR8Srcs5fSCmdHRFXpJQ+mHO+bmi7T+Scf2CG4wAAgE4TkJqvg7OQJzazym7O+e6c8xfWHh+NiOsj4gmzOh4AAEBbVWpQNWLjNOZxm+3Kmt2U0sUR8eyI+OyIl78jpXRVSul9KaVv3I3xAAAANNHOG1RNdRiNMMtpzBERkVJ6RES8MyJ+Ied8ZOjlL0TEk3LOx1JKL46Id0fE00bs4xUR8YqIiIsuumjGIwYAAKDpZlrZTSnti9Wg+9ac87uGX885H8k5H1t7fGlE7EspnTdiuzfmnC/JOV9y/vnnz3LIAAAAtWX9dXmz7MacIuJNEXF9zvk1Y7Z53Np2kVJ63tp
47p/VmAAAoIuqrPekniadhpw6s1J3wyynMT8/In4yIq5OKV259twvR8RFERE55z+PiH8TEa9MKS1HxMmI+LE8ajU1AABAh1VqULXN66kjC3hnFnZzzp+MbdZP55xfFxGvm9UYAAAA2kSDqvJ2pRszAAAAk9vpNNgOZl1hFwAAoO66WJmdlLALAACtpy0O3SPsAgAA1FylBlUjti1WhrtSJBZ2AQAAGmLnQbUrEXeDsAsAANAQJqSXJ+wCAEDLVZkCSz1pUFWdsAsAAEDrCLsAAAA1V606v3njgQZVHakSC7sAAAANsdOc2pF8O0DYBQAAaAjLr8sTdgEAoOUEpObrytTjaRJ2AQAAWi51MC0LuwAAADVXpUHVdtumjqzgFXYBAABarhvxdpCwCwAAUHMdnIU8MWEXAAAaIFe70erQe6c4EBqpi2FZ2AUAAKB1hF0AAICaq9SgapvXu1LlFXYBAABarisBt0jYBQAAqLlJw2pXbjdUJOwCAEADTNJkKm87sRXaR9gFAACgdYRdAACAmqvUoGrUtt2bxSzsAgAAtF0Hs66wCwAAUHdd7KY8KWEXAAAaYJIWU5M0t4KmEnYBAABaLnWwNCzsAgAAdEhXcq+wCwAA0CKj7qvckXw7QNgFAACgdYRdAABogDxBlyn9qejK1OUiYRcAAIDWEXYBAAA6pCudmYVdAACAFhk14z11sEWVsAsAAC03yXpf2qEjxdwBwi4AADSAuArVCLsAAAANoUpfnrALAADQcmnM4zYTdgEAABqiTCflkbXfriTcAmEXAACA1hF2AQCgASzVhGqEXQAAgIbQoKo8YRcAAKDlUmHRblfuuSvsAgAANESpBlUjqr9dCbhFwi4AALScma90kbALAAANkEffUAZK6WBhV9gFAABoCg2qyhN2AQAAaB1hFwAAoCHKNKja7n2pI5OahV0AAGg5633RjRkAAKglSzWJsGa3CmEXAACg9jpYmp2QsAsAAFB7k1V0uxiVhV0AAICGKNOgatRM5+LburJ+V9gFAICWs8yTLhJ2AQAAGkKDqvKEXQAAgNrryNzjKRJ2AQAAam/Sim73wrKwCwAA0BClGlSNCMYDDaqmOaAaE3YBAKABJlmqaZknXSTsAgAANIQGVeUJuwAAALU32eTjrkxdLhJ2AQAAak9FtyphFwAAWk5Mao9SDapGXPDUwQ5Vwi4AADTAqA67dM9O1+x2JN8OEHYBAABqr4txdTLCLgAAAK0j7AIAANTeZNPYSyz1bR1hFwAAWs69Wdtjxw2qCtOgU0emRAu7AADQAPIqEb64qELYBQAAqL1uVGOnSdgFAACgdYRdAACA2tOgqiphFwAAWs4qz/Yo1aBq231MZyx1J+wCAEADCKxEaFBVhbALAABQex0px06RsAsAANByXZm6XCTsAgAA1J7py1UJuwAA0HZyUqeMWtebCtOgu1LkFXYBAKABNCbquq5E1OkRdgEAAFrOml0AAABoAWEXAACg9kxjr0rYBQCAlsuCUqeMutrFacypI3OahV0AAGgAcbXrJguoqYMNroRdAAAAWkfYBQAAoHWEXQAAgJbryDLdAcIuAAC0XLbgt/PSmMdtJuwCAEADCKyU5m8lIoRdAAAAWkjYBQAAaDlrdgEAgNYxq5UuEnYBAKAJJFampCtVXmEXAACgRfLIb0Y6knALhF0AAICW60o1t0jYBQAAoHWEXQAAaDn36G0Pl7I8YRcAABpg9DpMKCcNPO7GnGZhFwAAoCHKxNRRlfzUwUW7Mwu7KaULU0ofTSldn1K6NqX0H0Zsk1JKf5xSujml9KWU0rfOajwAAAB0x8IM970cEa/KOX8hpXR2RFyRUvpgzvm6wjYvioinrf33bRHx+rWfAAAADDGZvbyZVXZzznfnnL+w9vhoRFwfEU8Y2uylEfFXedU/RcS5KaULZjUmAADoIut9m6+Ds5AntitrdlNKF0fEsyPis0MvPSEi7ij8fiA2B2IAAOg8HZW7bdLrn8b+0l4zD7sppUdExDsj4hdyzkeGXx7xlk2XMaX0ipTS5Smlyw8dOjSLYQIAANReqQZVo97XkYBbNNOwm1LaF6tB960553eN2ORARFxY+P2JEXHX8EY55zfmnC/JOV9y/vnnz2awAAAAtMYsuzGniHhTRFyfc37NmM3eExE/tdaV+dsj4nDO+e5ZjQkAAKDJzGYvb5bdmJ8fET8ZEVenlK5ce+6XI+KiiIic859HxKUR8eKIuDkiTkTEy2c4HgAA6CTrfZtv0mnIqSsLdQtmFnZzzp+MbaaU55xzRPzsrMYAAABtIa922zS/sOjK+t1d6cYMAADA5Eo1qBoRjLsScIuEXQAAAFpH2AUAAGgI09nLE3YBAKDlBKTm6+I05EkJuwAA0ABZS+VOm/TyF8NyV3KzsAsAANAQpRpUjajlpw6WhoVdAACAhlDfL0/YBQAAqLkOFmYnJuwCAEDbWe9LBwm7AADQAOJqt03coKr4uCNlYmEXAACgIUo1qBoRjDuSbwcIuwAAAA2hwl+esAsAAC0lGLVHFyuzkxJ2AQCg5YReUqkJ0O0i7AIAQAPspEFR9+JNe02zoXZX/i6EXQAAgIYo1aBq1Pu6knALhF0AAICGMCW9PGEXAABabppTYNkbk1ZmO1jYFXYBAABoH2EXAAAaIJvA2mlTbVDVkTKvsAsAANAmo5JxRwJukbALAABQc5Ov2e1e2hV2AQCg5bIOVXSQsAsAAE0gr0Ilwi4AAEDNTVqcL06D7sqUZmEXAACgRUbl4m7E20HCLgAAQM115XZB0yTsAgBAy1nuSxcJuwAA0AACK5NIHSwNC7sAAAAd0pXcK+wCAAC0yKjOzR3JtwOEXQAAAFpH2AUAgJab9B6tNF9Xpi4XCbsAANAAAitUI+wCAADQOsIuAABAi+QR0wBSB1tUCbsAAABt172sK+wCAEDbWe5LFwm7AADQAFlkhUqEXQAAgJYr3nqoK7chEnYBAABariP5doCwCwAA0CImvK8SdgEAoOVG3YoG2k7YBQCABpBXifB3UIWwCwAA0HKp0JUqdWQFr7ALAADQEDvtpNyNeDtI2AUAAGgRU51XCbsAAAAt15V76xYJuwAA0ACKdUSo2lYh7AIAAHRIV6q8wi4AAEBD7LxBVUcSboGwCwAALWfqa7e43KuEXQAAAFpH2AUAgAbIyrOEKn0Vwi4AAACtI+wCAAA0xI4bVBXe15VWVcIuAAC0XNayqFNMeV8l7AIAADSEHFuesAsAAA0g5HRbV6YeT5OwCwAAUHO+66hO2AUAAOiQtNMuVw0j7AIAQMuZAk0XCbsAAACN4ZuLsoRdAACAmuvGxOPpEnYBAABqTj23OmEXAACg5Yo9qbpSJRZ2AQCg5VQFu0VDslXCLgAANIAAQ4S/gyqEXQAAgJrrytTjaRJ2AQAAak5BtzphFwAAoOVSoTacOlImFnYBAKDlrPNsjzKXMqsDR4SwCwAAjSDAdFtHirFTJewCAADQOsIuAABAzanrVyfsAgBAy5kCTbEpVepIhyphFwAAoCHKNBvTkGyVsAsAAA
0gwHRbN2qx0yXsAgAA0DrCLgAAQM0p7Fcn7AIAQMuZAt0eO2021sVp0MIuAABAzVUJq77bWCXsAgBAAwgwUI2wCwAAQOsIuwAAADWnsl+dsAsAANAQO202llL3WlQJuwAAADVXqUGVMnBECLsAANAIWYKBSoRdAAAAWkfYBQAAqDl1/eqEXQAAaDlToNtjp1eye+2phF0AAIDaq9SgSh04IoRdAABoBPEFqhF2AQAAaB1hFwAAoCGsvy5P2AUAgJaTj0gd7FAl7AIAALSILzdWCbsAANAAAgxUI+wCAADQOsIuAAAArSPsAgBAy5kBTepgh6qZhd2U0ptTSvemlK4Z8/oLUkqHU0pXrv33K7MaCwAANJ/IClUszHDfb4mI10XEX22xzSdyzj8wwzEAAADQQTOr7OacL4uIB2a1fwAAABhnr9fsfkdK6aqU0vtSSt+4x2MBAIBWctui9nAty5vlNObtfCEinpRzPpZSenFEvDsinjZqw5TSKyLiFRERF1100e6NEAAAgEbas8puzvlIzvnY2uNLI2JfSum8Mdu+Med8Sc75kvPPP39XxwkAAHWgogfV7FnYTSk9Lq31v04pPW9tLPfv1XgAAADaIPtmJCJmOI05pfS2iHhBRJyXUjoQEb8aEfsiInLOfx4R/yYiXplSWo6IkxHxY9lVAQAAGCu7BVVpMwu7Oecf3+b118XqrYkAAIAZEpDoor3uxgwAAABTJ+wCAEADqM1CNcIuAABAi+iEtErYBQAAaAhBtjxhFwAAWk5AoouEXQAAAFpH2AUAgAZQnYVqhF0AAICGKPOlh+9FVgm7AAAANZfSXo+geYRdAABoqTz0E7pE2AUAAKg5a7arE3YBAKABsvosVCLsAgAANESZrzxUgVcJuwAAADWnQVV1wi4AALRU7pf4lProIGEXAACg5nxfUV3psJtS+q6U0svXHp+fUnry7IYFAAAUCTtQTamwm1L61Yj4PyPil9ae2hcRfzOrQQEAALBZLvGth87dq8pWdn84In4oIo5HROSc74qIs2c1KAAAADZoUFVd2bB7Kq9+hZAjIlJKZ81uSAAAwDTkoZ/QJWXD7t+llN4QEeemlP59RHwoIv5idsMCAACgz5rt6hbKbJRz/v2U0vdGxJGI+LqI+JWc8wdnOjIAAGCdsEOEKn0V24bdlNJ8RLw/5/zCiBBwAQAAdlmVNbu+GFm17TTmnPNKRJxIKT1yF8YDAABMy1roEX7oolLTmCNiMSKuTil9MNY6MkdE5Jx/fiajAgAAgAmUDbvvXfsPAACAXaY6X13ZBlV/mVLaHxFPX3vqxpzz0uyGBQAAFGWtiYjQoaqCUmE3pfSCiPjLiLgtIlJEXJhS+h9zzpfNbmgAAABEVGxQNbthNErZacx/EBHfl3O+MSIipfT0iHhbRDxnVgMDAAAmk9d/ij90z7bdmNfs6wfdiIic800RsW82QwIAAIDJlK3sXp5SelNE/PXa7z8REVfMZkgAAMAwDYq6zfWvrmzYfWVE/GxE/Hysrtm9LCL+bFaDAgAAYDNT0ssrG3YXIuKPcs6viYhIKc1HxGkzGxUAAADrqjSoUgZeVXbN7ocj4ozC72dExIemPxwAAGBa8lrokX26p1I4bqmyYff0nPOx/i9rj8+czZAAAABgMmXD7vGU0rf2f0kpXRIRJ2czJAAAAEZRpS+v7JrdX4iIt6eU7orV23U9PiJeNrNRAQAAwAS2rOymlJ6bUnpczvnzEfH1EfG3EbEcEf8YEbfuwvgAAACoQPF31XbTmN8QEafWHn9HRPxyRPxpRDwYEW+c4bgAAIAJ5aGfdIf+VNtPY57POT+w9vhlEfHGnPM7I+KdKaUrZzs0AAAA2JntKrvzKaV+IP6eiPhI4bWy630BAIAJaUxEhCp9FdsF1rdFxMdTSvfFavflT0REpJSeGhGHZzw2AAAA2JEtw27O+bdSSh+OiAsi4gM5r3+fNBcR/+usBwcAAEA1ZgGs2nYqcs75n0Y8d9NshgMAAExLP/QIP92TUur8hd9uzS4AAAA0jrALAAANkLUmIjpfrK1E2AUAAKB1hF0AAIAWMQtglbALAAAt1Q89wk/3pL0eQA0IuwAAAA3hi4vyhF0AAGgAjYmgGmEXAACA1hF2AQCgpdarwarCdJCwCwAA0CKmvK8SdgEAABqibJBN2jELuwAA0ASKdVCNsAsAAEDrCLsAANBS/SmvqsJ0kbALAADQEGW+uPDlxiphFwAAoOZSVOs4VXX7NhJ2AQCgAbL7yXRaVq+tTNgFAACgdYRdAABoOVVhukjYBQAAaIoSX1z4bmOVsAsAAFBzlRtO6U8l7AIAQBMo1nWbBlXVCbsAAAC0jrALAAAt1W9MZQ0nXSTsAgAANESZ7y1MeV4l7AIAQAOoznZb1QZV+lMJuwAAALWnWludsAsAAEDrCLsAANBSeegnzWc6e3nCLgAAQM1VWrMrEEeEsAsAAA0hwVBe0qFK2AUAAKg7DaqqE3YBAABoHWEXAABaqt/MSFOj9sguZmnCLgAAQM1VaVAlDq8SdgEAoAEU9KiiUvfmlhJ2AQAAak6DquqEXQAAaKl+QBKU6CJhFwAAoCF8bVGesAsAAFBzlRpUWeAdEcIuAAA0gvhCFUl/KmEXAACg7qy7rk7YBQCAlurPZjWrtT1cy/KEXQAAgJpz39zqhF0AAIAWUf1dJewCAEADCDBUoQ4s7AIAANSeBlXVCbsAANBS4lH7uKblCbsAAAA1p0FVdcIuAABAi6j+rhJ2AQCgAbIOVVSQkkqwsAsAAFBzGlRVJ+wCAEBL9YvBqsLt4VqWJ+wCAADUnAZV1Qm7AAAALaL4u0rYBQCABpBfqEIdWNgFAACghWYWdlNKb04p3ZtSumbM6yml9McppZtTSl9KKX3rrMYCAADdlAv/G7pllpXdt0TE92/x+osi4mlr/70iIl4/w7EAAADQITMLuznnyyLigS02eWlE/FVe9U8RcW5K6YJZjQcAAKAL3JN31V6u2X1CRNxR+P3A2nMAAMAQHXapRIeqPQ27oz7+kf8nnFJ6RUrp8pTS5YcOHZrxsAAAAOrJlx7l7WXYPRARFxZ+f2JE3DVqw5zzG3POl+ScLzn//PN3ZXAAANB0/WAkINFFexl23xMRP7XWlfnbI+JwzvnuPRwPAAAALbEwqx2nlN4WES+IiPNSSgci4lcjYl9ERM75zyPi0oh4cUTcHBEnIuLlsxoLAAA0naZDlKWSv2pmYTfn/OPbvJ4j4mdndXwAAIC2Kfulh/5UezuNGQAAmKG8/lOpj+4RdgEAAGgdYRcAAIDWEXYBAKAJzESGSoRdAACAhtBpuTxhFwAAWiqvJSMBqXtS0o9Z2AUAAKB1hF0AAABaR9gFAIAGMBMZqhF2AQAAGqLMlx7ZIu2IEHYBAKC18tBPukN/KmEXAACAFhJ2AQAAaB1hFwAAGsAyTCL8HVQh7AIAANRclTW48vAqYRcAAFqqXwVsYzXw1vuOD3QdXlxaiTsfOrmHI5qtqtdQfyphFwAAaJgvfvXB+Be//7F4y6dvW3/uZ
9/6hXj+qz+yd4OidoRdAACgUW6//0RERFx5x0Prz334hnv3ajjUlLALAAANkK3ELCW3cc52gb+D8oRdAACgNdqadSs1qGrpZ1CVsAsAAC21UeWUfpqucoOqKum4pYRdAACgNcR6+oRdAACgUbZat9r2NbuUJ+wCAEADyHBE+DuoQtgFAAAaJcX49ahtzYKVGlS19lOoRtgFAICWWm9PJfs0XuUGVbMZRqMIuwAAQGsI9vQJuwAAQKNs2aDKFF7WCLsAANAAIly3uW1udcIuAAC01VpCbtvU3i0bVLXsXHfCZ7BK2AUAABqli1OVKzeoUgkWdgEAAGgfYRcAAGiULk9jzm0/wSkSdgEAoAGEnG4zLbk6YRcAAFoqr//sTlDu0rmO4xNYJewCAACNsuV9diW9NUrBwi4AAEDNCfHVCbsAANAAXco69x97OA4eXhz7+pYNqmYxoBoZF3q3+8y6aGGvBwAAAFD0nN/8UERE3Pbql4x8vYvrcrdrULXdZ9ZFKrsAANBS/Q7OXZoCq2t1t673VoRdAACgNeS8VW5VJOwCAAA0hjBfnrALAABNIOWs27JBlc+JNcIuAADQKFs2qBJ2WSPsAgBAS+Whn3SFKx4h7AIAAC3SxdsSjaI/lbALAADQGNYklyfsAgBAA6hYliMM0ifsAgAArSHr0ifsAgBAS/WrnKqd3eJ6rxJ2AQCA1sgtT3plp7MnHaqEXQAAoJlG5bl2R12qEHYBAKABWl6w3BEfCVsRdgEAgNbwpQB9wi4AALRUf32n2xZ1i8C/StgFAABao+3BvmyQTSNXNHeLsAsAADTSyDjX7qxLBcIuAAA0gKmpm/lI2IqwCwAALbUekDuUCjt0qmxD2AUAgCk7dPTh+Or9J/Z6GJ006wr44ZNLcfO9R2d7kAlcc+fh1q9bLkvYBQCAKXvub30ovvv3PrrXw2AGfuT1n44XvuayPTv+djH2B/7kk3H34cWB5+Y62qtK2AUAAFpj1lXNm+89NtP9j1OlYv3giVORCgH32l///ukPqAGEXQAAaAATUylrpTf4+/6Fbsa+bp41AAB0QAf7U+laHRG9ng8hQtgFAAAaatRS1NbHvBJpfrnX23abLhB2AQCARmp9sN2hZZXdiBB2AQCAFsktncdcpfHW8koeWfXuGmEXAAAaoK0hjunrDf2tdDX4CrsAANBWa5mnS0G5Q6c6lmnMq4RdAACgkbpYsSwTY1eE3YgQdgEAgIYS6UZbHr7RbkcJuwAA0ACCXTltncZc5bxWejlS6mLde5CwCwAANNLo++y2NO1WMLxmt6u5V9gFAICW6gc/8a9brNldJewCAACt0dZpzH1lzk835lXCLgAA0EgiHVsRdgEAoAHaXrGclrZ+TG09r1kSdgEAgEYa2aBql74V2K3jTENXOzMLuwAA0FL9PNagXNYYe/WZ6jZdnrALAAC0xm5FQZGz/oRdAACgNXar4tqkacxdJewCAEAjCFddVjVbu9eusAsAALTKLjWo2pWj7NzBI4t7PYQ9J+wCAEBL5aGfbVGHGcR71qCqBufeFMIuAADQGru2Zrd1XyG0j7ALAAA0yla3jd21bsyybu0JuwAA0ADCVdeN/wPQGXo0YRcAAFqq6SFoJ+Nv+CnviMbLowm7AADQck0NvdsNey/Pas8aVI16rqHXd9aEXQAAoJZ2EuF2q3FUnRpUqeyOJuwCAAC1tF3FclSfql3rxlyjgNmr02BqRNgFAIAG6GKcGXfOdch2uz2EOpxz0wi7AADQUnnoZ9PsJOC1PRSOOj+V3dGEXQAAoJbGrYvdKtvt2prdGgVMa3ZHE3YBAIBaqlGe3KROQ6tT8K4TYRcAAGiUraJdNxtU7fUI6knYBQCABqhTuNot4865FpXMXR7Ixn7UAAAgAElEQVTC1gG/Bp9HDQm7AADQUrnhHarqdC/bYXs1tlHHlXVHE3YBAIBaGlvZ3cF72kw35tGEXQAAoJbGRrgaZLs65UtrdkcTdgEAgFrayVrUXbv10K4cpZw6T/feSzMNuyml708p3ZhSujml9B9HvP7TKaVDKaUr1/77mVmOBwAAdtM0Gwd1MdCMO+OtPovd68a8u9dj/XgjDlunKnOdLMxqxyml+Yj404j43og4EBGfTym9J+d83dCmf5tz/rlZjQMAAPZKzhEp7fUomhuU6xzi6jQ0a3ZHm2Vl93kRcXPO+Zac86mI+P8i4qUzPB4AANRKHSPIF7/6YNx37OG9HkY5Y289VO0tn/jyoVhcWpnKkMqMYbfVaSx1Msuw+4SIuKPw+4G154b9SErpSymld6SULhy1o5TSK1JKl6eULj906NAsxgoAAFNXx/uf/vCffTp+8E8+udfDKGVcRbrKp3rjwaPxk2/6XPzae66dzqBqSGV3tFmG3VETNoavwt9HxMU552+OiA9FxF+O2lHO+Y0550tyzpecf/75Ux4mAADMRl0jyN2HF/d6CKXsJMMNf8Fw+ORSRER85dCxaQxp4zg1urqy7mizDLsHIqJYqX1iRNxV3CDnfH/OuT+H4i8i4jkzHA8AAOyqaYaQLgaasQ2qKk5jnoldvh556OfAax382yhjlmH38xHxtJTSk1NK+yPixyLiPcUNUkoXFH79oYi4fobjAQCAXbWX1b9ihbOpYWjcNPA6VFX3fgQb6vB51NHMujHnnJdTSj8XEe+PiPmIeHPO+dqU0m9ExOU55/dExM+nlH4oIpYj4oGI+OlZjQcAAHZbU0NmXezk49u9Ww/tznHKqNNY6mRmYTciIud8aURcOvTcrxQe/1JE/NIsxwAAADTTuBC3dbjbneRXp2pqfUZSL7OcxgwAAJ021TW709tVY9QpUNZZHbt+14GwCwAAMyKsTWhcZXert7R0GnP/eKOCbc+f2UjCLgAAzMheFtyKx94qKNXZ2NHW4Dz2fgRF9RpNXQi7AAAwI3WLIE2rAO7oPrvTH8bo49QgcPfVaCi1IuwCAMCM1CkQRdRvPNsZNw28i9OYt1KjodSKsAsAADMyzRAyjaDalspunYLmbht17l3+PLYi7AIAwIzs6ZrdEc/1GpaKdnaf3WadY1lbnVXTrutuEXYBAGBWapJBmtoVelxwrUOgrcEQ1tVpLHUi7AIAwIzULWQ2rQJY6wZVNbq2dRpLnQi7AAAwI3XLlk1bszuOBlWD6jSWOhF2AQBgRuqWQeow/beKOjeo2qsh1ODUG0PYBQCAGdnLcFk8dv9h0yq7O5me29YpvVv9LTVtevpuEXYBAGBGahdBajegrY2t7G75plmMZMRhahQwazSUWhF2AQBgRuoWQppWAazzaOs0tjqNpU6EXQAAmJG6TaltXNjdwa2Hdq0bc40+yjpVmetE2AUAgAnde3Qx3vKpW+PB46cGX5hiBplGnim7i7seOhlX3P5A6f32ejne+6W7o9fLsdLLcenVd08lgO1kD5defXes7Mri5L0JmP2P9bO33B/3HFmMa+48HLccOr4nY6m7hb0eAAAANN1ff+b2+JOP3Bw5Il7+/CevP7+X9bY84nHZyu53/z8fjeVejtte/ZJS27/1
c1+N//zua+J3/vUz4/jDy/Gb770+Xvuyb4kffvYTqw16yE7y8ls/+9V48nlnxc/8s6+d6Njb2eti6sve+E/x6LP2xwPDX7CwTtgFAIAJLS6tRETE0kpv4Pm9DkTDyo5nuWJl9N4jixERcejow3H45FJERNx/bBohbNw05q3fdfDw4hSOXX+C7tZMYwYAgAmllCJi86196rZmdzfC9zSPMb4b895/rns/ArYj7AIAwJQMTxOuW2V3NxpUTTOITmNPs2retFfXtg5BvymEXQAAmFBa+zkcgKYZS6YRcnYl7K4dol/tnsa+JjnGrM5Y6Kw/YRcAACa1lrmGuwDv5S1hiofuP97N4UwedbcPlGU+31kF/N2+tHWbJdAEwi4AAEwoRX/Nbr2nMe/Omt3Vg0yhsDv+GDPbuMJua3Zt2UzYBQCACfUrkL1dub/rzu3G1Nv+EaZS2d1mGnOpfUxhHDSTsAsAAJNaS1SbujHXLGntZhaf5Zrd3d7HyP3uUYzOeW+nxzeJsAsAABPqT19eGZ7GPM3OxFPY1e42qJrCvsbdZzf6U6W3P0hb1uzW5dhNIuwCAMCEeuuV3fqs2R0MiquPd2XNbj+ITmNf20xjLlPhbEsuLF7P3fjSog2EXQAAmFA/ewyv2a1bJNmN6a91y2Ftu89uRP3+rupK2AUAgAn1K22b1+zWK5bsxprd9UPswprdrt5nV2W3HGEXAAAm1A+1m+6zuxeD2cKudGPur9mdxr52ON489pd2kHXLEXYBAGBC/ewxy0ruVBpU9aoec29T1fg1u+XHpUFVdwm7AAAwofHTmPdgMCOOvb6muOKAqo5/dft+p+Rq7x25vy2PMz70phHbTttuX9rieZjGXI6wCwAAE+qH3OFbDzV9Du1OQtXGNOZprNmd/POb2ZrdPQyczf6r2j3CLgAATGhcpbFuBbjKld2K+y9Wc2da2V0/RokGVbOaxjyTvZY4bs4quyUJuwAAMKGmNKiq2o15J9Oep9qgagr32d2NDtS7LVdce91Vwi4AAExoN9bsTmNXVaucOxl/nuKa3Smd9RT2MWKve3qf3RYm+BkQdgEAYELrDaA2VXbrEUr6o6ha5awa6FKacsAfV9mt8LnOLpTu7rUdbFC1q4duLGEXAAAKLr367ji6uFTpPf3wMTztt35LKzcGdMXtD8aX7zm65dY7alC19nMaDaouvfrgxPvoj+fztz246cuIvpvvPRZvvOwrceLUcvn9bvPRXH3gcFx71+HS+yurlyPeecWBqe+3jYRdAABYc/O9x+J/eesX4hffflWl9/UrjSs1uvXQKMWs9yOv/3R872sv23L7nQw/b6Tdib35U7fGbfcdH3uMMg2qioH97VfcMXKb3/iH6+K3L70hPnXz/aXHtt1n84Ov+2S85I8/WXp/Zb3rCwfity69fur7bSNhFwAA1pw8tRIREXc+dLLS+8bdx3aa05in0VV4VGXzweOn4t4ji6O331Fld23NbuV3jra4vDLiGGs/S4yvuMkDxwcr9jfdczR6vRwPHj8VERFLK+U7P914cOuq+KwcP7X58+j7wP/23XHbq1+yi6OpN2EXAAAmtN6gqpcHAtieNjHKxcerv4wazrP/7w/G8377w6P3sZOuv3ngx54bN46rDxyO73vtZfGGy25Zf65KuP9P774mPnLDPROObrqm9QVDWwi7AAAwoWJlt25Tl4uq32d35yczq/vbru189ccE4+hX76+848H154ZvHbWdr9y7eYr1rJS5FtPpgN0ewi4AAEyoHyJXeoMBrG7Bt+p4dtL1d6edn9ffX2ZqcqX9jX5+bi0YFmcu1+16VSftFgm7AAAwoX5IynloGnNtJvOuqh52d7Bmt1913XHYne62467B/FraLV6vqpXdupmTdQcIuwAAMKGNbsx5ZpXdaeyq8jTmiSq7OxtxuanJW29b5hrMrc35XSlssNMx10WZ7tRdIuwCAMCEemtTYXt5qDHU3gxn7djFCvOq6mF3J5Xdnb93Wgabc43eZm6tDFos5jY+7O71AGpG2AUAgAkNdGMuhsyahaeqo5lkze5Oz7zcmt3+VOnR2/ZKVGv7U37zwLZlR7n7yvwpKewOEnYBAGBC4yqndctOVcP3TtYc50Lw34nhd40a8nanUbwO4zZdn8ZcGGfT1+wmtd0Bwi4AAKzZaUOpfsBb6eVSU2j3ShO6MW8ew/gdjXtlILSO2ahfBR0IxnW7YBWp7A4SdgEAYM1OA1r/fZuzUr06VFU9v51WZyMmmcY8PIYt9j3mIINZd+uRFPff+MqusDtA2AUAgDU77iDcr+zm+lR2R41jVxowTdigajicjhrzttOYC6F1XH7daCrWjDW7ZejGPEjYBQCANTutZPYKYXJUF+S62JX77K6d9c6/OBj8fWXLaczbN6ga9/b+fss0s6qDMiMTdQcJuwAAsGbn05gL3ZhrUtkdpWq1dUdrdsdO6d6ZUWPebmpyMSBvF4hXWnTroTmV3QHCLgAArJl0zWYvD1bg6tbwqOrpTXKf3WlNCV4ZsWZ3+FhbPT9um34Vv3iOWx1rp3bzb0DWHSTsAgDAmp0Gk/UqYS8P7GOaMWennaIn2UfZwNrPWDkX7oG7487Ww2MYde+h0dv2Fb+02K5j88rA+t7p35ppeRcXAsu6g4RdAABYs+NpzIVmR4OV3YmHtGN5xONZVXY39r/9WtmqRjaoqvCeceewsc668NwMgunyyi7+EUi7A4RdAABYs1UzpK0UmzINTKGtWYuqqpXrsluvr9MtPN5pcNzUjXmracwTNKjqb1P8TKoOOZVIl8tbnUAVJcZWZjxdIuwCAMCajQBU9X2Fn6NKqjUxq27M61OXC5XtnZ56mWnM24X2Yr4cnFa+ecrySm+jg/ZOv+zYym5Wdudk3QHCLgAArNlxNbJwG5s633qo6prUskXJgVsvFR7vxPC7RgXQ7To+D1R2xxynv1a3l/P6ec6imdSurtnVoWrAwl4PAAAA9sJbP3t7fPmeY/GEc8+Ilz//4vjby++IR5+5PyI2utq+/fI74pwz9kVExL75FP/d1z82IiIWl1bijZfdEk8+76z4wW95/HrouuXQ8fjdf7xh/RhX3vFQ3HjwaJxcWomvOWt/3H14MV76rMfHlw4cjhc983HxzivujOc9+VFxxwMno5dzXHH7g/HT33lxPOac0+MdVxyIG+4+Ei965gWxbz7F+645uL7fP/3ozfHos/bHVx84sf7cT33HkyJFird97qvxo8+9MB65Nu6+U8u9eP3HvjL288g5R0opPvnl+zaeixw333ss7nroZHz308+Pm+45GvceeTi+62nnxQ0Hj8QDx07Fdz71vKGKeD9Ejj5Or5fjzZ+6NfYvzMVPfvuT4tjDy/HeL90dL3vuhSPDWi5MN/7bz98RP/Ssx6+/9rGbDsXN9x6Lz936wMB7PnDdPXHdXUfi8MmlgXHc8cCJ+MMP3RTPuOCcePcX71wfc3/8Bx48Gb///hvj4vPOijsfPBkvfubj4gPX3RMnTi3HUx/ziHjcOWcMHOfdV94Z3/a1j46llV4
cPPxwXHLxoyJFxOdu2xjP7/7jDfG93/DYePwjz4hnPvGRq+O79mA844Jz4sJHnxlfvf9EvOuLB2JxqRenlnvx/Kd+TXzPMx676TMr7nMcUXeQsAsAQOccXVyK/+u/XrP++33HH443fPyW+JYLz11/7ksHHor//R1fGnjfba9+SURE/MEHboy/+MStERHxgq87f6CS+LbP3bH++O+vuituOHh0YB9/9OEvR0TEx258fLz7yrs2je3ztz0Qf/zjz45ffPtVERHxvmsOxrMuPDc+duOh9W1+7/03RsTqtNV+mLv8tgfihc94bPzRh78cRxaX4t89/8nr2/d6Oa6+83B8+d5jYz+TnFdD/r9902cHnnvhaz6+fu7f99rL1h9//x9+YuMzKUzjXu9uPKZK+o/XHozffO/1ERHxlPMfEW/65K3xkRvujW990qPi6Y89e1N1tX87oE/dfH/8x3ddHV+683CcsW8+IlYDfH98w178x6vj+08vecb6c8Vrs77/nNev339dC8B9r/3QTSP33XftXUfih173qfXfn/qYR8TCXBq45u+44kC844oDEbHx9/OKv74izj59Ia7+tX8ZL3ztx+PU8kYJ/c2fujW+/Fsvin3zG5Nw/+Hqu7ccR5/C7iDTmAEA6JxiuIiIOHT04YiIeOD46s+l5RyX3/bgpvfd9dDJiIi479ip9ed6vfFVzFNb3Lj1M7fcP/L5Gw8eHRjfwSOL8fByL77hgnPie77+MQPb/sILnx6PPee0iFgNmf33XXfXkbj2riMREbF/YS6uuetwXHH7amXwf/quJ8coo6YLl52KXGz2tNzburJ7dHFp/fHxh5fXK8n99abDb+vv+9jDyxERcd/atZqWXs5TuyfwnQ+ejDsfPFlq26OLq+cz/LcYEfHRG+6NiIiDhxfj4OHFgc+s6K0/823xyhc8Zf13DaoGCbsAAHTO8DrKfhOh/trNG+85Gr/xD9dtet93vvojm55bGbrd0LdetFEdXjy1MnYM9xwZHdr2L2z+J/pyrxf75tOmcS/Mp/iJb3tSREQ8/bFnx9La67fffyK+cmi1ivuNjz8nHjqxFL996er06tNG7D9iXCOoscMfeu/a9hGxtBbwywbl/hcCS2MaOY1aR11lae122/Z6s7nl0PjjbX+sP1ubbv7tv/Ph+Pbf+fDY7VIabEqVpLsBPg4AADpnaaji2r89zE6aKi2v9Aam3j7xUWfG1b/2ffG8Jz86FkdU7bbyDRecEymlTQFteSXHwvzcptvY7Jubi5/7F0+NiIizTluI5bXzOrXSi4fXjn326YNrd0eF6YjRzahKd2Mu3G6o/8VB1U9y/X2bujGPOF6FvW93DrkwjXk3LJXo+rVSMnyntf/Z+J0iYRcAgM4Zvh1Mv6pYNmQMvLc3GJYW5lKcffq+2D8/Fye3qOyO8thzTovlld6mKcVLK72Yn0ubqp/zcynm5lI88ox9sbzSW6/8Lq30YnlltRq8b+h+NKctzI889uhpzOXG3T//Xo716nLVALkeAofeNuntgLZ7dy/P5pZD45S5FdHwlzHjzA1Xdi3aHSDsAgDQOcMV0uX1qbc72NdKb6AauTCf1n8uLlcLu2fuX4illbwpEC338uo05qEQtG/tWPvmUyz18npIWlrpxdJKL/bNz62Pp29sZXdk4Ctb2e1Xc/P6GKvmx42K8OAbR90OaJrTmFcKtx7aDVMNu3NpIOCKuoOEXQAAOme4Qrq8w2pkf1+9gbC7+k/shbm5yoHvjP3z60F1YHwrvViYmxuxZnfjWMsrvfUgtbySY2klx8Jc2tS0aOya3RFJv/jUVmtN+6/kXJyOXO3kh4N8306q7UXbTXmu4zTmsvfmTTHYgVlhd5CwCwBA52yexjzBmt3e4JrdhbmNamtVZ66F3eGws7TSr+wOhd25jSry8kper1gv93KcWumNrOKOr+xuDrTFj6MY0oa325jGnNe3q5pR+9Ofy6zZrWLbym5vd8NuqcpuybXeKaWYKyTcOWl3gPvsAgDQOcPVteUJ1uwur+SBQNWPGwvz1etKZ+yfj17efDua5V6/sjs8jXlu/edSL8dcIUgtnlpZrS4PVTbHVXZXennT51IMgcWQVtwu543zH6zsbnmqa/vfeLw+/Xl4m/WNNl6pUjXebtvVewOX3t3EykxRXir5dzi8ZpdBwi4AAJ2zqbI7pqpYxtJKb2RlcLgxVBln7lv95/nJpcG1vqvdmEdUdvvrg+dW1/MWi8knTq3EvoXNY9jq1kPD+x8XdgeCb2Ea9+o+ylfJi+F97K2Hct7y9e1sly17OVeecj2JMlOUy67ZTWloza7gO0DYBQCgc4bXh/Z/31Fld8w02OHGUGWcsX81iJ48tbz+XF6bGrxavR0c98Lc2prd+blYWskxN7cxjhNLK7FvxLrhcd2YR4XdYpm1eOyB4FuYxl3sxjwuQA5Wcwf3M+p9/U7JxWBc5SqdWtm6SVivl2NlbhfDbokgO3wdxmXxuTS0ZleLqgHCLgAAnTM8TbQfLnbWoKo3MnztZBrz/rX3jKzszm2u7Ba7MS/3ejFfeNvJU8vr05wHjlFpGvPgGPqWhiqyGx9bHjsdeWM/xfduDtCbpjHnjeOsH6XCZdquIjxqrfIslalQD1d2x30JM7dpze5kY2sbYRcAgM4Zrq6dqjD1drjyOLxmt28n05j7AfnE0P15l1ZyLMxv1Y15NQjPp0Jl99RKLMynTeFx3DTm4nrbvuLnMSqYrj7emMbd623/xUEx7J0aE3wHxtDb6DC9E8Prn4et5BzzefdS4vC661GGP4utpjUXR+4+u4N0YwYAoHOGq2sn18JlmQLfcOBc7o1esztRZXco7C73eqPvs7vejXlu9ZZFhbGdPLVSvbI7tP/iWRXPeyD49gbX7G7XjbkY9ornuTxm3XRv5DTm8sF3u/WvvV3uxlymsrvaNKs4xbtcZVfUHSTsAgDQOcPVtf604TJrdjfftmh0ZXcna3b77ymGwBz9acxzIxpU9bsxp1ju5YEwfOLUSuybT5vGNjbs5rwpVA02qBoMuH2r07g3piBv1425GPYGwu769OehNbu9yRpUbVfZXe7lie/lW0WZNbsREQ8vb3w2485hbs59drdiGjMAAJ0zHBqH18hu5dSm5lajK4P75qrXldanMRfGk/PqMffNp80Nqta7Mc/F8spyzBfSzvFTy+sNrIrmxySiPCLs5oFpzGOCbyHs9wr7GNegqvjZF89zff/DPbLy5mNWW7NbfdrwrPR6mz/jcYpT2YentfelGO7GLO0WCbsAAHTOcLjZrvo3/N5ikFvtRrx5u51UdvePqOz2xzfq1kP9QL1vPq11Yx6cIrxvYS6G0+O4QLTS21zZLp7XuNsEFadxF58fNzV4/DTm0ddgoxtzoUHVyC1HKztteDcs9Xqlg3Xxs1kc82WM++xuzTRmAAA6p2x1beR7V/LQNN7R92kdtV52O/1K7HDY7b+2uUFVobLb68XySl6fyrrcy7FvbvM05vkx6WillzcFzuLhhgNu8fn+MU4tD1akRxk3jbn//OZuzP
0gvXWX53GGK/Hb2beDLynKWl4ZcXunMQYru8sjt0lDa3YZJOwCANA5ZddNjrK00huogK70eiMrgwsjQuV2QWrfwuhbD417b/+5hf6a3V6OM/bNF17f/M/9cZXAXs6bqqDj1uwWK+HFadyDld3Rxynup3ieK2MaVPV/XylMj95qGvPp+wbPueoU5dP3jb4P8TQsV5rGvBFwTy6NPoeUrNPdirALAEDn7LTZ0ep7ewPvX1rJI7sDj+rGvF2Q6ndXHrVGc9T+FtanMc+tVQ17ceb++cJ7Nt96aFwlsJfzpi8BiqGyWCFdLISvpcI07mIIHrtmtxD2ToxoUDVsuEHV6s/x12/4M64ads+YIOymFFu2RF5e6ZW69VDEYNV7VKU/YvVaWqc7nrALAEDnlA0coyytDE73Xb3P7ObtRlVitw27a5XdUWs0R1WKN6Yxp7UglQeOsX9UZXdMabeXN0/vLlawiwG3WJFdLoT9gbA78iiD4bN4nv3bJg1/cbB+66G192137U5fGAq7y9W+2Dhj/4wruzuYxmzN7s5oUAXUXq+X4y8/c1v8yHOeGOecvm+vhwOdceUdD8VDJ07FC77uMWO3OXlqJd72ua/GT3/nxWP/AV3c31fuPRbHHl6On/qOJ0VKKW6+91jcfO/R+P5vumDT9kcWl+I1H7gpIiKOPbwcT3r0mXHumfviRc+8II4tLsc1dx2O+ZTi1vuPxyv/+VPiYzceisecc1p84+MfGREb/7/jZc+9MM7c351/8rz3S3fH5bc/EK98wVPiMWefvun1e44sxps/dWssnlqJZz7x3OjlHNfffWTTtNCFuRT/7rueHFfd8VCcWunFOWfsixMPr8Q3PP6cOGv/fLzpU7fGj15yYTzl/EdsO6Y7HzoZn/nK/XH45FL8xLddVGma6DuvOBDf+dSviQseeUbp90RE3HjwaLzriwfiB7/58fH3X7or/ofnXRRP+pqz4saDR+OGg0fi//3ErZX2V/SnH705bjp4dP33S68+GEcXlzZtN6oT8nZVw36gvemeo5teGzUluf/cwvxcPHhiKRbmluOCczeu+8J82lRhHdeN+S8uu2XT+tbfed/1649/+9KNx3/zT7evP/75t30xTlubOnzLfcfXn7/qjofi195z7abjfP62B9YfF8/zspsOxclTK3H84cH1qR+98VDcfv+JuGHtM7/53mNx6OjDI88hItbH0lccUxmTVHbHdU3u+/333xgHjyyu/z7q8+n7m89ufMaj/h4i1roxu7vuWN35//xAY116zd3x639/XRw8vBi/9OJn7PVwoDP+1Z9+KiIibnv1S8Zu83vvvzHe/Klb47HnnB4v+ebNgXXU/iIinvOkR8U3PeGR8cLXfHzsMV7/sa/EWz5926bn//byO+Lme48NVJle+IzHxsvf8vmBfX3w+nvi1//+urj1vuPxGy/9pi3H1iY/+1++EBERBw8vxuv/7XM2vf7+aw/GGz5+y9pvG/+YPuf0jX8W5og4urgcj3vk6fGb771+4P3zcyl+46XfGG/4+C1x5ORy/M6/fua2Y/rRP/9M3PnQyYiIeOjEqXjV931dqXM5cWo5XvX2q+LJ550VH/3FF5R6T99bPn1bvO1zX41/uOru1WPniF968TPiX/7hZQPbzaWIx5x9epw4tRxHFpfjnNMX4sjicpy2MBcPj+nQ/LEb7x34/YaDR+K0hfn49//sa+P1H/tK/NjzLoqIiEeesfkL2ude/Oi49+jiwN9v37961uPjwkefGU981Blx5ORqeD5j33ycddpCROT4usedHb/2g98Qb/n0bfHsix4VVx14KM49c/UYz7rwkXHp1XdHzjle/MwL4uTSgThycjm+5cJz4+mPPTsu+/J98agz98V3PuW8OPfMffG15521KQR+4suHNo3pwIMn1x/fWtj+i199cP3xwSOL6+tkTy2vxHmPOC3OPn0h7jv2cLzrCwdGfoZ9i0sr8bhzTo/9C3NxxwMn4o4HTkRExHmPOC2OP7wcJ5dW4rq7DsdVdzy0fn/gk6dWxk7rjYj45Rc/I/7nv75i/fdi06ytPO0xj4g7HjwRL37mBfHQiaU4c/98HDyyGCdOrcSPXvLE+PxtDw58BqM84rSF/7+9O4+SqyzzOP57ujvd2feFEBISCKsKJCzCRJsiwF4AABgISURBVBEFIYCCIwrBBVzOMHjgqMPRMWEUQWZUdAR1HJWZEQQHQWdAyBlQQEA4IEtCgIQQAg0kITuBkJ30Us/8UfdW36q+VV3VtVd/P+fU6Vtv3ap6q96+t+5zn/d9r0zSjr1dGj+8VefM3k/XP/Jq6vF7l29MLbe1NOmOJWvVZOnjmy868QDd9ew6LVnd8x2HmV0zaeKINn3o0El6tP0NjRveqoMn9T7hNO/YqXprV4eOnDpa/7d0Q6r8o0fuG9tDoFER7AKoec+9/rYkBT/2ACohkecEKm/v7pAkPfXam30Gu1G3L1mrd08ZlVb25s69emtXhw6aNEJS8qAxzvPrtvcqu/+FTWn32zfv0Jo3dwd17J1xy2X5+m2aOnZoqidJIuFavHqrjpsxtqDXqbamJtNDQVA2fdwwzRg/TFL8JXYmjxqsxxecnLr/Tme3Dv3Wn/TKGzt7rdud8NRr7M0ziIhmsrbtyb89wi61myLPz1eYvAyD7GxjdF/57hklHfMYDeTfte/I1HJLU3ICqctOPVg/OvfIVPmSNVv18Z//VYdNHqkfz5slSXr0Gx/K+vrHHzBOn5szo1f5ecdO03nHTkvd/+opB6c9/sp3z0i7/+DXTtKXb31GC59brx+cc4TOPXZq2uPurhkL7kk998DLk8uvfS/9+zrrZ49q6dpt+sm8WTrtXftkrXcxFtyxTLc+tUb/cMrB+tJJB6bKp8+/O1WnsK6Z9ct1sk6SfvrAy7r2/pd00YkH6PLICfUvn3xQ6erfjxP1lxfwnGOm9943ff+cI1LLl3xwZmr5386fVXBd6hljdgHUvDWRs7wAKmPLruxdBOPc9Phq7dwbf2mMUDTLddtTr/d6/JRrH9aHr+vJuo0d1pr3+//w3pUZr/WI/uWeFVnWzs7ddeZPH9XnbngqVXb9I6/q3Osf12PtWwp+vWq6e+kGff7GRfr8jYv0wX/9S6o8bibYzJl/w8zPrTHtJPW+Fmu5dBcxiVS+zy3n5D7Txg5VW0uTxg1r1WWnJoPPzJM4h+2TDIjnlilQzOVvDhwnSZpz0Phej5mZZk4crv3HDVVzk2na2KGaOXF4r+/rU0EW+4j9RvV6jVKZ++7kdzNn5ri08hnjh2n00EEyM00ZPUSTRw1Oq9+Z7+n7BNx7gnp/pICTdbWo3utfLqRJANS8cPhQdxGTiQAozN4sl7noJXLc29mVkHKckzps8ght3r5X7z1grG5b1DuI2pqRgS3mOqhpVSwglgnfc8mat1NlL29OjpVb//ae2OfUm+iMt++eMlLPr9veK5Of7TqsoTBLmu9YQcuy3JfOYL/fn3C0swZ+M5qaTIu/eYrMTMNam/WFOTN6jVce0tqsF6+eq7aWyuegzjt2q
j42a0rWMdR//Mr7U2O5/3zZB2K3pfOOnaqzj5pS1kmdPnDwBK34ztxe7/Hnyz6QOvHy8NdPSpvW6qV/Pr3P/2NJ+uAhE/Xi1XPLermhSvjJvFm69tyjql2NmkNmF0DdKOYyEQAKU+ilOiSpO9eFL5WcsXWfUYM1YXib3NMvSxLXbbqY66D2V76zpNaqfLqfR/elU8cMjV3HzHJeD7ZSbVNMe2Q+t1onTEcMHpQcx2mWNaAaPKi5KpePyVUnKTn5VThOtrWlKXaCLDMra6AbinuP5iZL1a+lOb1+rS1NeQW7Unmvq1sp0e8CPfhGANSB4JIDNXCWHhgo+pNV7Ssw6Uy4WpqbUtcKjXaFjcvCVSPwzJyJtt7kk82M7kvDWWfjvum4mYR73if+EjGlVlSwm/FddFao6zWA2kGwC6BukNkFKiduEqO+9JUN7upOaFCTpbIv0W06LqipRjfUamSTSymf4DC6Tq6MXEsemd1yn5Ao5n8g8zej3tsWQOEIdgHUvLCnY713LwTqSX8yu30Hu66W5p7usdFAJrp9hxnf/m7zmdcULUS9n1TL5zuLfsbU9URjnhbXZTX1PonK9LgprhtzIuN+fbctgMIR7AKoeeE4QLoxA5XTnzG7fQXInYlE2ri6zkj2OBr4dqayhvnVIby+Z6iYmYL787lrSaHdmIe25urGnD2zG35P5T45UEx7ZP4/0o0ZGHgIdgHUvPBsfL1nXIB6Eg0y8r3mbl9dn7u6XYOaLNU9NhqMRLNuqUAq4TknSQoNbU2/uEQxszjXe7BbaGZ3cBjsxmTDc2Z2u8Pse5kzuyVsS7oxAwMPwS6AmtdRYJYHQPGiAVG+4yb7Cky6utMzu9HgOBqYdEYCqVyTJIWGZMykWswkU6W63FG15ArWE4neAerQHLPQ5hyzG/xPlPv7Kma/nxn4c8IUGHgIdgHUvNREKHV+EArUk2iQke9Yx76yomGmdlBcZjcty9vTRTZXwBUamjHJUjFjM3Nlp4sYClwxufaTnTEB6pB+d2MOe9yU9yRkMQFqZrdlhsIAAw/BLoCaV6mDKgA9OssQ7IaZ2p7ZmBNpj4VSvTkSiZxdaUOZwW4px3lG1cNliXJlQuP2pUMyuoBH5cqqV2o25mICVCaoAkCwC6DmdVbooApAj/50Y+4rCxfOxhwGUXFdl8P1UuvnyC6GBg8qXbCb67n1MJQiVxvEBaip6+zGPM1yfPVh1rTckz4VNxtzZjfm2m8/AKVFsAug5vVMVsOBClAp/cns9hUMdgaZ2taW4NJD0QA3Zjbmzm7PK7Pb2pK+TlFdX/PIjNayXJnQaMY8lJkVj8rVbbsncC53N+YiTlxkfBcMhQEGHoJdADWvq8hrbgIoXNzsyH3J6zq7TZYKYLuyZHZTE1QlEnmN2c3M/hYTgOUKaOvhhFvuzG7vme3DrLjHjtrNLpp9L1R3AYOfiwlQM+tWD5l5AKVFsAug5oXX4mRyEaByouNTcwYckYdyBVrurq6Eq6W5KdWNuSPLmN3o0IV8ujE3Z6yTOba2kIml4gIiU5CJ7qr9E265x+wWltnNpacbcz+ux1xAgFzU+OuM59ZDZh5AaRHsAqh5qYMqDlSAiunKEohm6kzEd0Xu9XrBeoOaLNWNuSutG3Pv1+nszm+Cql7vlbGvKCSTGLefSXhPprnWpb7nmIx4Z0xmN9eY3VzZ3mImqCpkX15Mj57OhCt6HqQe2g9AaZU12DWzuWa20szazWx+zONtZva74PEnzWx6OesDoD51VmhsGIAecd2K46RlZHNkPsOgJZrZ7YzJ5kpSR1cYXOZ36aFMmdnAQvYdcZnEMFNcD7Mxh/XPnLQr+lj0+xjSz8xuTzfmfmR2Cwg6i52NOXoNZobCAANP2YJdM2uW9O+STpd0uKTzzezwjNW+KGmru8+UdJ2ka8pVHwD1K3VQxeQiQMVEu6fmCjjynbU5fCx5nd0w2O09A3P0/TqDSxUVak9nd9r9QoKcuM9aqcvslEJYxyExwW7cvrQtmNwr7pPlSoiH7dmf2ZgL68Zc3Jjd6KWV6mHMNYDSKmdm9zhJ7e7+qrt3SLpN0tkZ65wt6aZg+X8lnWyWa6J7AANRmE3hshFA5USztDkzu9FZlLtyXbYnyOw2WaqLbdqMzzGzMXd1e2x33L7s6UgPdgsJyOKy0/V0re/we4zL2PbsS3s+Y0s/uolLxWV2C/kei5psLJFIG5Pc1e3yQgZwA6h72a8kXrwpkl6P3F8r6b3Z1nH3LjPbJmmcpC1lrFdZffPOZeom+wSUVEdwAN2+eZcW3LG0yrUBBoala7elln/xl3ZNGNEWu96LG3aklu96br1WbtoRu947ncntuKW5KZXZveXJ1anHf/NEz/KNj63S/S9sUvsbOzVzwvCC6/7rv65Ku79y4/a89x3tm3emlsPnrNiwXZL0WPuWmt8HrXlrt6T4zO7PH2rXxJFtWv/2nlRZagKwAg9dXt+afJ+de7vy+k6i2eTl6/Nvjxc3Jv+fdnV0F/zd7+1K78a8dXeH5t++rKDXAFDfyhnsxp2KzdyV5rOOzOwiSRdJ0rRp04qvWRn9ZeUbqQNzAKUxedRgTRzRpg3b3tEDKzZXuzrAgHHwpOHq6va0wDeXdVv3aN3WPVkfnzJ6iA7fd6QmjRyswyaP1Ktv7NKIwS3a8U6XXt60U8PbWrRzb5dWbtyhlRt3yCS994CxmjNznJau3aaES6ve3KVEwrV1d4e27u7UZ4/fX0fvP0Z/DvYNzU2WClhbW5rU0ZWQuwred4wb1trrOTve6aqLfdAhk0boo0fuqzd2vKY3d3Wkypet2yatS3ZdPnDCMI0b1qbW5iYdus8IXfqhmb1e57Mn7K8r7lreq3zUkEFqbW7SkVNHa1Oe++Xxw1u1ZWeyLp3diYK/x/HD2wp+zqQRg/WJo/fTb55YrTFDB2nj9nf00MrNGtrarN1B9v+Kj2SOsAPQSKxc3TnM7ARJV7r7acH9BZLk7t+LrHNvsM7jZtYiaaOkCZ6jUsccc4wvXry4LHUGAAAAAFSXmT3t7scU+zrlHLO7SNJBZjbDzFolzZO0MGOdhZIuDJY/IenBXIEuAAAAAAD5KFs35mAM7qWS7pXULOkGd19uZt+RtNjdF0r6laTfmFm7pLeUDIgBAAAAAChKOcfsyt3vkXRPRtkVkeV3JH2ynHUAAAAAAAw85ezGDAAAAABAVRDsAgAAAAAaDsEuAAAAAKDhEOwCAAAAABoOwS4AAAAAoOEQ7AIAAAAAGg7BLgAAAACg4RDsAgAAAAAaDsEuAAAAAKDhEOwCAAAAABoOwS4AAAAAoOEQ7AIAAAAAGg7BLgAAAACg4RDsAgAAAAAaDsEuAAAAAKDhEOwCAAAAABoOwS4AAAAAoOEQ7AIAAAAAGg7BLgAAAACg4RDsAgAAAAAaDsEuAAAAAKDhEOwCAAAAABoOwS4AAAAAoOEQ7AIAAAAAGg7BLgAAAACg4Zi7V7sOBTGzNyStrnY9+jBe0pZqVwJ9op3qA+1UP2ir+kA71Q/aqj7QTvWBdqof4yUNc/cJxb5Q
3QW79cDMFrv7MdWuB3KjneoD7VQ/aKv6QDvVD9qqPtBO9YF2qh+lbCu6MQMAAAAAGg7BLgAAAACg4RDslsd/VLsCyAvtVB9op/pBW9UH2ql+0Fb1gXaqD7RT/ShZWzFmFwAAAADQcMjsAgAAAAAaDsFuCZnZXDNbaWbtZja/2vUZyMxsqpk9ZGYrzGy5mX0lKL/SzNaZ2bPB7YzIcxYEbbfSzE6rXu0HHjNbZWbLgjZZHJSNNbP7zezl4O+YoNzM7KdBWy01s9nVrf3AYGaHRLabZ81su5l9lW2qNpjZDWa22cyej5QVvA2Z2YXB+i+b2YXV+CyNLEs7/dDMXgza4g9mNjoon25meyLb1i8jzzk62Ge2B21p1fg8jSpLOxW8r+O4sPyytNXvIu20ysyeDcrZpqokx3F5+X+n3J1bCW6SmiW9IukASa2SnpN0eLXrNVBvkiZLmh0sj5D0kqTDJV0p6Wsx6x8etFmbpBlBWzZX+3MMlJukVZLGZ5T9QNL8YHm+pGuC5TMk/VGSSTpe0pPVrv9AuwX7u42S9mebqo2bpBMlzZb0fKSsoG1I0lhJrwZ/xwTLY6r92RrplqWdTpXUEixfE2mn6dH1Ml7nKUknBG34R0mnV/uzNdItSzsVtK/juLB6bZXx+I8kXREss01Vr52yHZeX/XeKzG7pHCep3d1fdfcOSbdJOrvKdRqw3H2Duy8JlndIWiFpSo6nnC3pNnff6+6vSWpXsk1RPWdLuilYvknSxyLlN3vSE5JGm9nkalRwADtZ0ivuvjrHOmxTFeTuj0h6K6O40G3oNEn3u/tb7r5V0v2S5pa/9gNHXDu5+33u3hXcfULSfrleI2irke7+uCeP/m5WT9uiBLJsT9lk29dxXFgBudoqyM6eK+nWXK/BNlV+OY7Ly/47RbBbOlMkvR65v1a5gytUiJlNlzRL0pNB0aVBl4gbwu4Sov2qzSXdZ2ZPm9lFQdkkd98gJXeSkiYG5bRV9c1T+sED21RtKnQbos2q7wtKZjNCM8zsGTN72MzeH5RNUbJtQrRT5RSyr2N7qr73S9rk7i9HytimqizjuLzsv1MEu6UT17efqa6rzMyGS7pd0lfdfbukX0g6UNJRkjYo2b1Fov2qbY67z5Z0uqRLzOzEHOvSVlVkZq2SzpL0P0ER21T9ydY2tFkVmdk/SeqSdEtQtEHSNHefJekySb81s5Ginaql0H0d7VR95yv9xCzbVJXFHJdnXTWmrF/bFcFu6ayVNDVyfz9J66tUF0gys0FKblC3uPsdkuTum9y9290Tkv5TPd0qab8qcvf1wd/Nkv6gZLtsCrsnB383B6vTVtV1uqQl7r5JYpuqcYVuQ7RZlQSTrHxE0qeDbpQKusW+GSw/reT4z4OVbKdoV2faqQL6sa9je6oiM2uR9HFJvwvL2KaqK+64XBX4nSLYLZ1Fkg4ysxlB5mOepIVVrtOAFYzT+JWkFe5+baQ8OrbzbyWFs/ctlDTPzNrMbIakg5ScrABlZmbDzGxEuKzkZC3PK9km4Sx7F0q6K1heKOmCYKa+4yVtC7vAoCLSzpSzTdW0QreheyWdamZjgi6apwZlKCMzmyvpG5LOcvfdkfIJZtYcLB+g5Db0atBWO8zs+OC37gL1tC3KpB/7Oo4Lq+sUSS+6e6p7MttU9WQ7LlcFfqdaSvg5BjR37zKzS5X8wpsl3eDuy6tcrYFsjqTPSlpmwZTzki6XdL6ZHaVkl4dVkv5ektx9uZn9XtILSnYju8Tduyte64FpkqQ/JPeDapH0W3f/k5ktkvR7M/uipDWSPhmsf4+Ss/S1S9ot6fOVr/LAZGZDJX1YwXYT+AHbVPWZ2a2STpI03szWSvq2pO+rgG3I3d8ys6uVPEiXpO+4e76T9CAPWdppgZIz+d4f7AefcPeLlZxl9jtm1iWpW9LFkfb4kqRfSxqi5Bjf6DhfFClLO51U6L6O48Lyi2srd/+Ves8tIbFNVVO24/Ky/05Z0FsGAAAAAICGQTdmAAAAAEDDIdgFAAAAADQcgl0AAAAAQMMh2AUAAAAANByCXQAAAABAwyHYBQAMeGbWbWbPRm7z+1j/YjO7oATvu8rMxhf7OiWox5Vm9rVq1wMAgFLiOrsAAEh73P2ofFd291+WszL1xJIXhzV3T1S7LgAARJHZBQAgiyDzeo2ZPRXcZgblqUyomX3ZzF4ws6VmdltQNtbM7gzKnjCzI4LycWZ2n5k9Y2bXS7LIe30meI9nzex6M2vOUp+rzGyJmS0zs0Mz6xPcf97Mpge3F83sv4KyW8zsFDN7zMxeNrPjIi9/pJk9GJT/XeS1vm5mi4LPclVQNt3MVpjZzyUtkTS1dN86AAClQbALAIA0JKMb83mRx7a7+3GSfibpxzHPnS9plrsfIenioOwqSc8EZZdLujko/7akR919lqSFkqZJkpkdJuk8SXOCDHO3pE9nqesWd58t6ReS8ul6PFPSTyQdIelQSZ+S9L7guZdH1jtC0pmSTpB0hZnta2anSjpI0nGSjpJ0tJmdGKx/iKSb3X2Wu6/Oox4AAFQU3ZgBAMjdjfnWyN/rYh5fKukWM7tT0p1B2fsknSNJ7v5gkNEdJelESR8Pyu82s63B+idLOlrSomSvYA2RtDlLfe4I/j4dvlYfXnP3ZZJkZsslPeDubmbLJE2PrHeXu++RtMfMHlIywH2fpFMlPROsM1zJ4HeNpNXu/kQe7w8AQFUQ7AIAkJtnWQ6dqWQQe5akb5nZuxTpnhzz3LjXMEk3ufuCPOqzN/jbrZ7f8S6l99YaHLO+JCUi9xNKPw7IrJcH9fqeu1+fVlmz6ZJ25VFXAACqhm7MAADkdl7k7+PRB8ysSdJUd39I0j9KGq1k9vMRBd2QzewkJbseb88oP13SmOClHpD0CTObGDw21sz2L6COqyTNDp47W9KMgj5h0tlmNtjMxkk6SdIiSfdK+oKZDQ9ee0pYRwAAah2ZXQAAgjG7kft/cvfw8kNtZvakkieIz894XrOk/w66KJuk69z9bTO7UtKNZrZU0m5JFwbrXyXpVjNbIulhJbsDy91fMLNvSrovCKA7JV0iKd+xsLdLuiD4DIskvZTvB494StLdSo4jvtrd10taH4wnfjzoXr1T0meUzCoDAFDTzD2uNxUAADCzVZKOcfct1a4LAAAoDN2YAQAAAAANh8wuAAAAAKDhkNkFAAAAADQcgl0AAAAAQMMh2AUAAAAANByCXQAAAABAwyHYBQAAAAA0HIJdAAAAAEDD+X9xDd4uYQsfHQAAAABJRU5ErkJggg==\n", 475 | "text/plain": [ 476 | "" 477 | ] 478 | }, 479 | "metadata": { 480 | "needs_background": "light" 481 | }, 482 | "output_type": "display_data" 483 | } 484 | ], 485 | "source": [ 486 | "# plot the scores\n", 487 | "fig = plt.figure(figsize=(16, 
12))\n", 488 | "ax = fig.add_subplot(111)\n", 489 | "plt.plot(np.arange(len(scores)), scores)\n", 490 | "plt.xlabel('Episode number')\n", 491 | "plt.ylabel('Score')\n", 492 | "plt.show()" 493 | ] 494 | } 495 | ], 496 | "metadata": { 497 | "kernelspec": { 498 | "display_name": "Python 3", 499 | "language": "python", 500 | "name": "python3" 501 | }, 502 | "language_info": { 503 | "codemirror_mode": { 504 | "name": "ipython", 505 | "version": 3 506 | }, 507 | "file_extension": ".py", 508 | "mimetype": "text/x-python", 509 | "name": "python", 510 | "nbconvert_exporter": "python", 511 | "pygments_lexer": "ipython3", 512 | "version": "3.6.3" 513 | } 514 | }, 515 | "nbformat": 4, 516 | "nbformat_minor": 2 517 | } 518 | -------------------------------------------------------------------------------- /p3_collab_compet/config.json: -------------------------------------------------------------------------------- 1 | { 2 | "exp_name": "DDPGAgents_exp", 3 | "cuda": true, 4 | "gpu": 0, 5 | 6 | "optimizer_actor": { 7 | "optimizer_type": "Adam", 8 | "betas": [0.9, 0.999], 9 | "optimizer_params": { 10 | "lr": 1e-4, 11 | "eps": 1e-7, 12 | "weight_decay": 0 13 | } 14 | }, 15 | 16 | "optimizer_critic": { 17 | "optimizer_type": "Adam", 18 | "betas": [0.9, 0.999], 19 | "optimizer_params": { 20 | "lr": 1e-3, 21 | "eps": 1e-7, 22 | "weight_decay": 0 23 | } 24 | }, 25 | 26 | "DDPG": { 27 | "num_agents": 2, 28 | "gamma": 0.99, 29 | "tau": 0.001, 30 | "buffer_size": 10e6 31 | }, 32 | 33 | "architecture": { 34 | "fc1_units": 250, 35 | "fc2_units": 100 36 | }, 37 | 38 | "trainer" : { 39 | "num_episodes": 15000, 40 | "batch_size": 128, 41 | "save_dir": "./saved/", 42 | "save_freq": 1000 43 | } 44 | } -------------------------------------------------------------------------------- /p3_collab_compet/images/tennis_gif.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p3_collab_compet/images/tennis_gif.gif -------------------------------------------------------------------------------- /p3_collab_compet/report.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p3_collab_compet/report.pdf -------------------------------------------------------------------------------- /p3_collab_compet/requirements.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p3_collab_compet/requirements.txt -------------------------------------------------------------------------------- /p3_collab_compet/saved/DDPGAgents_exp/checkpoint_actor_solved.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p3_collab_compet/saved/DDPGAgents_exp/checkpoint_actor_solved.pth -------------------------------------------------------------------------------- /p3_collab_compet/saved/DDPGAgents_exp/checkpoint_critic_solved.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vmelan/DRLND-udacity/8d2a38d2894b0f692fc6ffbe652eb94440b516de/p3_collab_compet/saved/DDPGAgents_exp/checkpoint_critic_solved.pth -------------------------------------------------------------------------------- 
/p3_collab_compet/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import torch 3 | 4 | 5 | def pick_device(config, logger): 6 | """ Pick device """ 7 | if config["cuda"] and not torch.cuda.is_available(): 8 | logger.warning("Warning: There's no CUDA support on this machine," 9 | "training is performed on cpu.") 10 | device = torch.device("cpu") 11 | elif not config["cuda"] and torch.cuda.is_available(): 12 | logger.info("Training is performed on cpu by user's choice") 13 | device = torch.device("cpu") 14 | elif not config["cuda"] and not torch.cuda.is_available(): 15 | logger.info("Training on cpu") 16 | device = torch.device("cpu") 17 | else: 18 | logger.info("Training on gpu") 19 | device = torch.device("cuda:" + str(config["gpu"])) 20 | 21 | return device 22 | 23 | def ensure_dir(path): 24 | if not os.path.exists(path): 25 | os.makedirs(path) --------------------------------------------------------------------------------
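The two helpers above are the only glue shared by the other modules: `pick_device` turns the `"cuda"`/`"gpu"` flags of `config.json` into a `torch.device`, and `ensure_dir` creates the checkpoint folder on demand. A minimal usage sketch, mirroring how `DDPGAgents.__init__` and the training loop in `Tennis.ipynb` call them:

```python
import json
import logging

from utils import pick_device, ensure_dir

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("example")

with open("config.json", "r") as f:
    config = json.load(f)

# "cuda": true and "gpu": 0 resolve to torch.device("cuda:0") when a GPU is available;
# otherwise pick_device logs a warning and falls back to the CPU
device = pick_device(config, logger)
print(device)

# Checkpoints are written to <save_dir>/<exp_name>/ (./saved/DDPGAgents_exp/), created if missing
save_path = config["trainer"]["save_dir"] + config["exp_name"] + "/"
ensure_dir(save_path)
```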