├── Average Rewards.png
├── README.md
├── Reinforcement_Learning_solving_a_simple_4_4_Gridworld_using_Q_learning.ipynb
├── Reinforcement_Learning_solving_a_simple_4_4_Gridworld_using_Q_learning.py
├── Total Rewards.png
├── a simple 4 by 4 Gridworld.png
└── simple 4 by 4 Gridworld.png
/Average Rewards.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Elktrn/Reinforcement-Learning-solving-a-simple-4by4-Gridworld-using-Qlearning-in-python/09b4b46e556f1b313bb6c0f6908234279cdc896c/Average Rewards.png
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Reinforcement_Learning_solving_a_simple_4_4_Gridworld_using_Qlearning
2 | Solving a simple 4*4 Gridworld, closely resembling the OpenAI Gym FrozenLake environment, using Q-learning, a temporal-difference reinforcement learning method
3 | 
4 | **Note: this is NOT a Deep Q-learning implementation! For the Deep Q-learning version, search my GitHub repos**
5 | ![4*4 gridworld](https://github.com/elktrn/Reinforcement_Learning_solving_a_simple_4_4_Gridworld_using_SARSA-in-python/blob/main/a%20simple%204%20by%204%20Gridworld.png)
6 | This program uses reinforcement learning to solve a 4*4 gridworld modeled on the FrozenLake environment in OpenAI Gym.
7 | The method used is Q-learning, a fundamental temporal-difference method that learns action values from sampled transitions rather than from a full dynamic-programming sweep.
8 | 
9 | | S | O | O | O |
10 | | O | O | O | * |
11 | | O | * | O | O |
12 | | O | * | O | T |
13 | 
14 | 
15 | S = start cell
16 | O = normal cells
17 | * = penalized cells
18 | T = terminal cell
19 | 
20 | Our agent's goal is to find a policy that moves from the S (start) cell to the T (goal) cell with maximum reward (or minimum negative reward).
21 | The valid actions for each cell are stored in the GridWorld "actions" dictionary, sketched below.
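
For reference, here are a few entries of that mapping, excerpted from the `GridWorld` class further down in this repository; each (row, column) state maps to the moves that stay inside the grid:

```python
# Excerpt from GridWorld.__init__ (see the .py file below for the full mapping).
actions = {
    (0, 0): ('D', 'R'),            # top-left corner: only down or right
    (0, 1): ('L', 'D', 'R'),       # top edge: everything except up
    (1, 1): ('U', 'L', 'D', 'R'),  # interior cell: all four moves
    (3, 2): ('U', 'L', 'R'),       # bottom edge: everything except down
}
# (3, 3) has no entry: it is the terminal cell, so no actions leave it.
```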
22 | The positive and negative rewards for the cells are stored in the GridWorld "rewards" dictionary and can be modified by the user. The current rewards for the * (hole) cells and the T (goal) cell are set to:
23 | self.rewards = {(3, 3): 0.5, (1, 3): -0.5, (2, 1): -0.5, (3, 1): -0.5}
24 | For example, the reward for entering (3,3), which is the goal, is +0.5, so the agent gets +0.5 whenever it moves to cell (3,3); every other move costs -0.01.
25 | The size of the Gridworld can be changed in the GridWorld class by adding states and their actions.
26 | ![Average Rewards](https://github.com/Elktrn/Reinforcement_Learning_solving_a_simple_4_4_Gridworld_using_Qlearning/blob/main/Average%20Rewards.png)
27 | ![Total Rewards](https://github.com/Elktrn/Reinforcement_Learning_solving_a_simple_4_4_Gridworld_using_Qlearning/blob/main/Total%20Rewards.png)
28 | ***************************
29 | Algorithm Flow
30 | ***************************
31 | First we initialize a random policy that indicates the preferred move in every cell:
32 | 
33 |  | D | | L | | R | | D | 
34 | ----------------------------
35 |  | U | | U | | R | | D | 
36 | ----------------------------
37 |  | D | | R | | R | | U | 
38 | ----------------------------
39 |  | U | | L | | R | 
40 | ----------------------------
41 | 
42 | U = going up
43 | D = going down
44 | L = going left
45 | R = going right
46 | 
47 | and we initialize the Q table with every available action's value at zero (the update rule that fills in this table is sketched after the listing):
48 | 
49 | (0, 0): {'D': 0, 'R': 0},
50 | (0, 1): {'L': 0, 'D': 0, 'R': 0},
51 | (0, 2): {'L': 0, 'D': 0, 'R': 0},
52 | (0, 3): {'L': 0, 'D': 0},
53 | (1, 0): {'U': 0, 'D': 0, 'R': 0},
54 | (1, 1): {'U': 0, 'L': 0, 'D': 0, 'R': 0},
55 | (1, 2): {'U': 0, 'L': 0, 'D': 0, 'R': 0},
56 | (1, 3): {'U': 0, 'L': 0, 'D': 0},
57 | (2, 0): {'U': 0, 'D': 0, 'R': 0},
58 | (2, 1): {'U': 0, 'L': 0, 'D': 0, 'R': 0},
59 | (2, 2): {'U': 0, 'L': 0, 'D': 0, 'R': 0},
60 | (2, 3): {'U': 0, 'L': 0, 'D': 0},
61 | (3, 0): {'U': 0, 'R': 0},
62 | (3, 1): {'U': 0, 'L': 0, 'R': 0},
63 | (3, 2): {'U': 0, 'L': 0, 'R': 0}}
64 | 
65 | 
66 | 
67 | 
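
During training the agent follows an epsilon-greedy version of the current policy (epsilon = 0.01 in this code) and updates the table with the standard Q-learning rule. A minimal sketch of the update performed on every step, using the values from the code (learning rate alpha = 0.1, discount gamma = 0.9); here `state`, `action`, `reward`, and `nextState` come from the current transition:

```python
# One Q-learning update, as done inside the training loop further down.
alpha, gamma = 0.1, 0.9                # learning rate and discount factor used in this repo

targetQ = reward                       # terminal next state: no future value to add
if not env.is_terminal(nextState):
    # Bootstrap from the best action value available in the next state.
    targetQ = reward + gamma * max(env.qTable[nextState].values())

# Move the stored value a fraction alpha toward the target.
env.qTable[state][action] += alpha * (targetQ - env.qTable[state][action])
```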
68 | ***************************
69 | Output
70 | ***************************
71 | 
72 | --------------------------------
73 | step:0
74 | --------------------------------
75 |  | R | | L | | L | | L | 
76 | ----------------------------
77 |  | U | | U | | U | | U | 
78 | ----------------------------
79 |  | D | | U | | U | | U | 
80 | ----------------------------
81 |  | U | | U | | U | 
82 | ----------------------------
83 | 
84 | 
85 | 
86 | 
87 | 
88 | --------------------------------
89 | step:200
90 | --------------------------------
91 |  | R | | R | | D | | L | 
92 | ----------------------------
93 |  | D | | R | | D | | L | 
94 | ----------------------------
95 |  | U | | R | | R | | D | 
96 | ----------------------------
97 |  | U | | R | | R | 
98 | ----------------------------
99 | 
100 | 
101 | 
102 | 
103 | 
104 | --------------------------------
105 | step:400
106 | --------------------------------
107 |  | R | | R | | D | | L | 
108 | ----------------------------
109 |  | R | | R | | D | | L | 
110 | ----------------------------
111 |  | D | | R | | R | | D | 
112 | ----------------------------
113 |  | U | | R | | R | 
114 | ----------------------------
115 | 
116 | 
117 | 
118 | 
119 | 
120 | --------------------------------
121 | step:600
122 | --------------------------------
123 |  | R | | R | | D | | L | 
124 | ----------------------------
125 |  | R | | R | | D | | L | 
126 | ----------------------------
127 |  | D | | R | | R | | D | 
128 | ----------------------------
129 |  | U | | R | | R | 
130 | ----------------------------
131 | 
132 | 
133 | 
134 | 
135 | 
136 | --------------------------------
137 | step:800
138 | --------------------------------
139 |  | R | | R | | D | | L | 
140 | ----------------------------
141 |  | R | | R | | D | | L | 
142 | ----------------------------
143 |  | D | | R | | R | | D | 
144 | ----------------------------
145 |  | U | | R | | R | 
146 | ----------------------------
147 | 
148 | 
149 | 
150 | 
151 | 
152 | --------------------------------
153 | step:1000
154 | --------------------------------
155 |  | R | | R | | D | | L | 
156 | ----------------------------
157 |  | R | | R | | D | | L | 
158 | ----------------------------
159 |  | D | | R | | R | | D | 
160 | ----------------------------
161 |  | U | | R | | R | 
162 | ----------------------------
163 | 
164 | 
165 | 
166 | 
167 | 
168 | --------------------------------
169 | step:1200
170 | --------------------------------
171 |  | R | | R | | D | | L | 
172 | ----------------------------
173 |  | R | | R | | D | | L | 
174 | ----------------------------
175 |  | D | | R | | R | | D | 
176 | ----------------------------
177 |  | U | | R | | R | 
178 | ----------------------------
179 | 
180 | 
181 | 
182 | 
183 | 
184 | --------------------------------
185 | step:1400
186 | --------------------------------
187 |  | R | | R | | D | | L | 
188 | ----------------------------
189 |  | R | | R | | D | | L | 
190 | ----------------------------
191 |  | D | | R | | R | | D | 
192 | ----------------------------
193 |  | U | | R | | R | 
194 | ----------------------------
195 | 
196 | 
197 | 
198 | 
199 | 
200 | --------------------------------
201 | step:1600
202 | --------------------------------
203 |  | R | | R | | D | | L | 
204 | ----------------------------
205 |  | R | | R | | D | | L | 
206 | ----------------------------
207 |  | D | | R | | R | | D | 
208 | ----------------------------
209 |  | U | | R | | R | 
210 | ----------------------------
211 | 
212 | 
213 | 
214 | 
215 | 
216 | --------------------------------
217 | step:1800
218 | --------------------------------
219 |  | R | | R | | D | | L | 
220 | ----------------------------
221 |  | R | | R | | D | | L | 
222 | ----------------------------
223 |  | D | | R | | R | | D | 
224 | ----------------------------
225 |  | U | | R | | R | 
226 | ----------------------------
227 | 
228 | 
229 | 
230 | 
231 | 
232 | --------------------------------
233 | step:2000
234 | --------------------------------
235 |  | R | | R | | D | | L | 
236 | ----------------------------
237 |  | R | | R | | D | | L | 
238 | ----------------------------
239 |  | D | | R | | R | | D | 
240 | ----------------------------
241 |  | U | | R | | R | 
242 | ----------------------------
243 | 
244 | 
245 | exploited:12482 explored:120
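
The policies printed above are read off the learned Q table by acting greedily in every state. A minimal sketch of that extraction, using the same one-liner the training loop applies after each episode (`env` is the trained GridWorld instance):

```python
# Greedy policy extraction: pick the highest-valued action in each state.
greedy_policy = {
    state: max(qValues, key=qValues.get)  # argmax over that state's action values
    for state, qValues in env.qTable.items()
}
env.printPolicy(greedy_policy)
```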
O\n", 32 | " # O O O *\n", 33 | " # O * O O\n", 34 | " # O * 0 T\n", 35 | " self.qTable = None\n", 36 | " self.actionSpace = ('U', 'D', 'L', 'R')\n", 37 | " self.actions = {\n", 38 | " (0, 0): ('D', 'R'),\n", 39 | " (0, 1): ('L', 'D', 'R'),\n", 40 | " (0, 2): ('L', 'D', 'R'),\n", 41 | " (0, 3): ('L', 'D'),\n", 42 | " (1, 0): ('U', 'D', 'R'),\n", 43 | " (1, 1): ('U', 'L', 'D', 'R'),\n", 44 | " (1, 2): ('U', 'L', 'D', 'R'),\n", 45 | " (1, 3): ('U', 'L', 'D'),\n", 46 | " (2, 0): ('U', 'D', 'R'),\n", 47 | " (2, 1): ('U', 'L', 'D', 'R'),\n", 48 | " (2, 2): ('U', 'L', 'D', 'R'),\n", 49 | " (2, 3): ('U', 'L', 'D'),\n", 50 | " (3, 0): ('U', 'R'),\n", 51 | " (3, 1): ('U', 'L', 'R'),\n", 52 | " (3, 2): ('U', 'L', 'R')\n", 53 | " }\n", 54 | " self.rewards = {(3, 3): 0.5, (1, 3): -0.5, (2, 1):-0.5, (3, 1):-0.5}\n", 55 | " self.explored = 0\n", 56 | " self.exploited = 0\n", 57 | " self.initialQtable()\n", 58 | "\n", 59 | " def initialQtable(self):\n", 60 | " self.qTable = {}\n", 61 | " for state in self.actions:\n", 62 | " self.qTable[state]={}\n", 63 | " for move in self.actions[state]:\n", 64 | " self.qTable[state][move]=0\n", 65 | " print(self.qTable)\n", 66 | "\n", 67 | " def updateQtable(self, newQ,updateRate=0.05):\n", 68 | " for state in self.qTable:\n", 69 | " for action in self.qTable[state]:\n", 70 | " self.qTable[state][action] = self.qTable[state][action]+(updateRate*(newQ[state][action]-self.qTable[state][action]))\n", 71 | " \n", 72 | " def getRandomPolicy(self):\n", 73 | " policy = {}\n", 74 | " for state in self.actions:\n", 75 | " policy[state] = np.random.choice(self.actions[state])\n", 76 | " return policy\n", 77 | "\n", 78 | " def reset(self):\n", 79 | " return (0, 0)\n", 80 | " \n", 81 | " def is_terminal(self, s):\n", 82 | " return s not in self.actions\n", 83 | "\n", 84 | " def getNewState(self,state,action):\n", 85 | " i, j = zip(state)\n", 86 | " row = int(i[0])\n", 87 | " column = int(j[0])\n", 88 | " if action == 'U':\n", 89 | " row -= 1\n", 90 | " elif action == 'D':\n", 91 | " row += 1\n", 92 | " elif action == 'L':\n", 93 | " column -= 1\n", 94 | " elif action == 'R':\n", 95 | " column += 1\n", 96 | " return row,column\n", 97 | "\n", 98 | " def chooseAction(self, state, policy, exploreRate=0.01):\n", 99 | " if exploreRate > np.random.rand():\n", 100 | " self.explored += 1\n", 101 | " return np.random.choice(self.actions[state])\n", 102 | " self.exploited += 1\n", 103 | " return policy[state]\n", 104 | "\n", 105 | " def move(self, state, policy, exploreRate):\n", 106 | " action = self.chooseAction(state, policy, exploreRate)\n", 107 | " row,column=self.getNewState(state,action)\n", 108 | " if (row, column) in self.rewards:\n", 109 | " return action,(row, column),self.rewards[(row, column)]\n", 110 | " return action,(row, column),-0.01\n", 111 | " \n", 112 | " def printPolicy(self, policy):\n", 113 | " line = \"\"\n", 114 | " counter = 0\n", 115 | " for item in policy:\n", 116 | " line += f\" | {policy[item]} | \"\n", 117 | " counter += 1\n", 118 | " if counter > 3:\n", 119 | " print(line)\n", 120 | " print(\"----------------------------\")\n", 121 | " counter = 0\n", 122 | " line = \"\"\n", 123 | " print(line)\n", 124 | " print(\"----------------------------\")" 125 | ] 126 | }, 127 | { 128 | "cell_type": "code", 129 | "source": [ 130 | "env= GridWorld()\n", 131 | "policy = env.getRandomPolicy()\n", 132 | "# policy = {(0, 0): 'R', (0, 1): 'R', (0, 2): 'D', (0, 3): 'L', (1, 0): 'U', (1, 1): 'R', (1, 2): 'D', (1, 3): 'D'\n", 133 | "# ,(2, 0): 'D', (2, 1): 'R', (2, 2): 'R', 
127 | {
128 | "cell_type": "code",
129 | "source": [
130 | "env= GridWorld()\n",
131 | "policy = env.getRandomPolicy()\n",
132 | "# policy = {(0, 0): 'R', (0, 1): 'R', (0, 2): 'D', (0, 3): 'L', (1, 0): 'U', (1, 1): 'R', (1, 2): 'D', (1, 3): 'D'\n",
133 | "# ,(2, 0): 'D', (2, 1): 'R', (2, 2): 'R', (2, 3): 'D', (3, 0): 'R', (3, 1): 'R', (3, 2): 'R'}\n",
134 | "# env.printPolicy(policy)\n",
135 | "averageRewards=[]\n",
136 | "rewards=[]\n",
137 | "alpha=0.1\n",
138 | "for i in range(1,2002):\n",
139 | "    state = env.reset()\n",
140 | "    stepCounts=0\n",
141 | "    episodeReward=0\n",
142 | "    while (not env.is_terminal(state)) and (stepCounts<20):\n",
143 | "        action, nextState, reward = env.move(state, policy,0.01)\n",
144 | "        stepCounts += 1\n",
145 | "        targetQ=reward\n",
146 | "        episodeReward+=reward\n",
147 | "        if not env.is_terminal(nextState):\n",
148 | "            targetQ=reward+(0.9*env.qTable[nextState][max(env.qTable[nextState], key=env.qTable[nextState].get)])\n",
149 | "        env.qTable[state][action]=env.qTable[state][action]+alpha*(targetQ-env.qTable[state][action])\n",
150 | "        state = nextState\n",
151 | "    rewards.append(episodeReward)\n",
152 | "    averageRewards.append(sum(rewards)/i)\n",
153 | "    for state in policy:\n",
154 | "        policy[state] = max(env.qTable[state], key=env.qTable[state].get)\n",
155 | "    if (i-1)%200==0:\n",
156 | "        print(f\"\\n\\n\\n step:{i-1} - Average reward so far:{sum(rewards)/i}\")\n",
157 | "        print(f\"\\n\\n\\n step:{i-1}\")\n",
158 | "        env.printPolicy(policy)\n",
159 | "        print(\"\\n\")\n",
160 | "print(f\"exploited:{env.exploited} explored:{env.explored}\")\n",
161 | "\n",
162 | "plt.title(f'Total Rewards')\n",
163 | "plt.yscale('symlog')\n",
164 | "plt.plot(rewards)\n",
165 | "plt.savefig(\"Total Rewards\",dpi=200)\n",
166 | "plt.clf()\n",
167 | "plt.title(f'Average Rewards')\n",
168 | "plt.yscale('symlog')\n",
169 | "plt.plot(averageRewards)\n",
170 | "plt.savefig(\"Average Rewards\",dpi=200)"
171 | ],
172 | "metadata": {
173 | "colab": {
174 | "base_uri": "https://localhost:8080/",
175 | "height": 1000
176 | },
177 | "id": "w1iSio8wsj2w",
178 | "outputId": "3bff8cdc-cd6f-4051-9afa-90b42cba37a4"
179 | },
180 | "execution_count": 37,
181 | "outputs": [
182 | {
183 | "output_type": "stream",
184 | "name": "stdout",
185 | "text": [
186 | "{(0, 0): {'D': 0, 'R': 0}, (0, 1): {'L': 0, 'D': 0, 'R': 0}, (0, 2): {'L': 0, 'D': 0, 'R': 0}, (0, 3): {'L': 0, 'D': 0}, (1, 0): {'U': 0, 'D': 0, 'R': 0}, (1, 1): {'U': 0, 'L': 0, 'D': 0, 'R': 0}, (1, 2): {'U': 0, 'L': 0, 'D': 0, 'R': 0}, (1, 3): {'U': 0, 'L': 0, 'D': 0}, (2, 0): {'U': 0, 'D': 0, 'R': 0}, (2, 1): {'U': 0, 'L': 0, 'D': 0, 'R': 0}, (2, 2): {'U': 0, 'L': 0, 'D': 0, 'R': 0}, (2, 3): {'U': 0, 'L': 0, 'D': 0}, (3, 0): {'U': 0, 'R': 0}, (3, 1): {'U': 0, 'L': 0, 'R': 0}, (3, 2): {'U': 0, 'L': 0, 'R': 0}}\n",
187 | "\n",
188 | "\n",
189 | "\n",
190 | " step:0 - Average reward so far:-0.20000000000000004\n",
191 | "\n",
192 | "\n",
193 | "\n",
194 | " step:0\n",
195 | " | D | | L | | L | | L | \n",
196 | "----------------------------\n",
197 | " | U | | L | | U | | U | \n",
198 | "----------------------------\n",
199 | " | U | | U | | U | | U | \n",
200 | "----------------------------\n",
201 | " | U | | U | | U | \n",
202 | "----------------------------\n",
203 | "\n",
204 | "\n",
205 | "\n",
206 | "\n",
207 | "\n",
208 | " step:200 - Average reward so far:0.17547263681592024\n",
209 | "\n",
210 | "\n",
211 | "\n",
212 | " step:200\n",
213 | " | R | | D | | D | | L | \n",
214 | "----------------------------\n",
215 | " | R | | R | | D | | D | \n",
216 | "----------------------------\n",
217 | " | D | | R | | D | | D | \n",
218 | "----------------------------\n",
219 | " | U | | R | | R | \n",
220 | "----------------------------\n",
221 | "\n",
222 | "\n",
223 | "\n",
224 | "\n",
225 | "\n",
226 | " step:400 - Average reward so far:0.31107231920199635\n",
227 | "\n",
228 | "\n",
229 | "\n",
230 | " step:400\n",
231 | " | R | | D | | D | | L | \n",
232 | "----------------------------\n",
233 | " | R | | R | | D | | D | \n",
234 | "----------------------------\n",
235 | " | D | | R | | D | | D | \n",
236 | "----------------------------\n",
237 | " | U | | R | | R | \n",
238 | "----------------------------\n",
239 | "\n",
240 | "\n",
241 | "\n",
242 | "\n",
243 | "\n",
244 | " step:600 - Average reward so far:0.3545923460898478\n",
245 | "\n",
246 | "\n",
247 | "\n",
248 | " step:600\n",
249 | " | R | | D | | D | | L | \n",
250 | "----------------------------\n",
251 | " | R | | R | | D | | D | \n",
252 | "----------------------------\n",
253 | " | D | | R | | D | | D | \n",
254 | "----------------------------\n",
255 | " | U | | R | | R | \n",
256 | "----------------------------\n",
257 | "\n",
258 | "\n",
259 | "\n",
260 | "\n",
261 | "\n",
262 | " step:800 - Average reward so far:0.37714107365792293\n",
263 | "\n",
264 | "\n",
265 | "\n",
266 | " step:800\n",
267 | " | R | | D | | D | | L | \n",
268 | "----------------------------\n",
269 | " | R | | R | | D | | D | \n",
270 | "----------------------------\n",
271 | " | D | | R | | D | | D | \n",
272 | "----------------------------\n",
273 | " | U | | R | | R | \n",
274 | "----------------------------\n",
275 | "\n",
276 | "\n",
277 | "\n",
278 | "\n",
279 | "\n",
280 | " step:1000 - Average reward so far:0.3906193806193747\n",
281 | "\n",
282 | "\n",
283 | "\n",
284 | " step:1000\n",
285 | " | R | | D | | D | | L | \n",
286 | "----------------------------\n",
287 | " | R | | R | | D | | D | \n",
288 | "----------------------------\n",
289 | " | D | | R | | D | | D | \n",
290 | "----------------------------\n",
291 | " | U | | R | | R | \n",
292 | "----------------------------\n",
293 | "\n",
294 | "\n",
295 | "\n",
296 | "\n",
297 | "\n",
298 | " step:1200 - Average reward so far:0.39998334721065104\n",
299 | "\n",
300 | "\n",
301 | "\n",
302 | " step:1200\n",
303 | " | R | | D | | D | | L | \n",
304 | "----------------------------\n",
305 | " | R | | R | | D | | D | \n",
306 | "----------------------------\n",
307 | " | D | | R | | D | | D | \n",
308 | "----------------------------\n",
309 | " | U | | R | | R | \n",
310 | "----------------------------\n",
311 | "\n",
312 | "\n",
313 | "\n",
314 | "\n",
315 | "\n",
316 | " step:1400 - Average reward so far:0.40705210563882704\n",
317 | "\n",
318 | "\n",
319 | "\n",
320 | " step:1400\n",
321 | " | R | | D | | D | | L | \n",
322 | "----------------------------\n",
323 | " | R | | R | | D | | D | \n",
324 | "----------------------------\n",
325 | " | D | | R | | D | | D | \n",
326 | "----------------------------\n",
327 | " | U | | R | | R | \n",
328 | "----------------------------\n",
329 | "\n",
330 | "\n",
331 | "\n",
332 | "\n",
333 | "\n",
334 | " step:1600 - Average reward so far:0.41175515302936005\n",
335 | "\n",
336 | "\n",
337 | "\n",
338 | " step:1600\n",
339 | " | R | | D | | D | | L | \n",
340 | "----------------------------\n",
341 | " | R | | R | | D | | D | \n",
342 | "----------------------------\n",
343 | " | D | | R | | D | | D | \n",
344 | "----------------------------\n",
345 | " | U | | R | | R | \n",
346 | "----------------------------\n",
347 | "\n",
348 | "\n",
349 | "\n",
350 | "\n",
351 | "\n",
352 | " step:1800 - Average reward so far:0.4154247640199969\n",
353 | "\n",
354 | "\n",
355 | "\n",
356 | " step:1800\n",
357 | " | R | | D | | D | | L | \n",
358 | "----------------------------\n",
359 | " | R | | R | | D | | D | \n",
360 | "----------------------------\n",
361 | " | D | | R | | D | | D | \n",
362 | "----------------------------\n",
363 | " | U | | R | | R | \n",
364 | "----------------------------\n",
365 | "\n",
366 | "\n",
367 | "\n",
368 | "\n",
369 | "\n",
370 | " step:2000 - Average reward so far:0.41885057471265524\n",
371 | "\n",
372 | "\n",
373 | "\n",
374 | " step:2000\n",
375 | " | R | | D | | D | | L | \n",
376 | "----------------------------\n",
377 | " | R | | R | | D | | D | \n",
378 | "----------------------------\n",
379 | " | D | | R | | D | | D | \n",
380 | "----------------------------\n",
381 | " | U | | R | | R | \n",
382 | "----------------------------\n",
383 | "\n",
384 | "\n",
385 | "exploited:12529 explored:97\n"
386 | ]
387 | },
388 | {
389 | "output_type": "display_data",
390 | "data": {
391 | "image/png": "[base64 PNG data omitted; this is the rendered Average Rewards plot, also committed as Average Rewards.png]",
392 | "text/plain": [
393 | "<Figure size 432x288 with 1 Axes>"
394 | ]
395 | },
396 | "metadata": {
397 | "needs_background": "light"
398 | }
399 | }
400 | ]
401 | }
402 | ]
403 | }
404 | 
--------------------------------------------------------------------------------
/Reinforcement_Learning_solving_a_simple_4_4_Gridworld_using_Q_learning.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import copy
3 | import matplotlib.pyplot as plt
4 | class GridWorld:
5 |     def __init__(self):
6 |         # S O O O
7 |         # O O O *
8 |         # O * O O
9 |         # O * O T
10 |         self.qTable = None
11 |         self.actionSpace = ('U', 'D', 'L', 'R')
12 |         self.actions = {
13 |             (0, 0): ('D', 'R'),
14 |             (0, 1): ('L', 'D', 'R'),
15 |             (0, 2): ('L', 'D', 'R'),
16 |             (0, 3): ('L', 'D'),
17 |             (1, 0): ('U', 'D', 'R'),
18 |             (1, 1): ('U', 'L', 'D', 'R'),
19 |             (1, 2): ('U', 'L', 'D', 'R'),
20 |             (1, 3): ('U', 'L', 'D'),
21 |             (2, 0): ('U', 'D', 'R'),
22 |             (2, 1): ('U', 'L', 'D', 'R'),
23 |             (2, 2): ('U', 'L', 'D', 'R'),
24 |             (2, 3): ('U', 'L', 'D'),
25 |             (3, 0): ('U', 'R'),
26 |             (3, 1): ('U', 'L', 'R'),
27 |             (3, 2): ('U', 'L', 'R')
28 |         }
29 |         self.rewards = {(3, 3): 0.5, (1, 3): -0.5, (2, 1):-0.5, (3, 1):-0.5}
30 |         self.explored = 0
31 |         self.exploited = 0
32 |         self.initialQtable()
33 | 
34 |     def initialQtable(self):
35 |         self.qTable = {}
36 |         for state in self.actions:
37 |             self.qTable[state]={}
38 |             for move in self.actions[state]:
39 |                 self.qTable[state][move]=0
40 |         print(self.qTable)
41 | 
42 |     def updateQtable(self, newQ,updateRate=0.05):
43 |         for state in self.qTable:
44 |             for action in self.qTable[state]:
45 |                 self.qTable[state][action] = self.qTable[state][action]+(updateRate*(newQ[state][action]-self.qTable[state][action]))
46 | 
47 |     def getRandomPolicy(self):
48 |         policy = {}
49 |         for state in self.actions:
50 |             policy[state] = np.random.choice(self.actions[state])
51 |         return policy
52 | 
53 |     def reset(self):
54 |         return (0, 0)
55 | 
56 |     def is_terminal(self, s):
57 |         return s not in self.actions
58 | 
59 |     def getNewState(self,state,action):
60 |         i, j = zip(state)
61 |         row = int(i[0])
62 |         column = int(j[0])
63 |         if action == 'U':
64 |             row -= 1
65 |         elif action == 'D':
66 |             row += 1
67 |         elif action == 'L':
68 |             column -= 1
69 |         elif action == 'R':
70 |             column += 1
71 |         return row,column
72 | 
73 |     def chooseAction(self, state, policy, exploreRate=0.01):
74 |         if exploreRate > np.random.rand():
75 |             self.explored += 1
76 |             return np.random.choice(self.actions[state])
77 |         self.exploited += 1
78 |         return policy[state]
79 | 
80 |     def move(self, state, policy, exploreRate):
81 |         action = self.chooseAction(state, policy, exploreRate)
82 |         row,column=self.getNewState(state,action)
83 |         if (row, column) in self.rewards:
84 |             return action,(row, column),self.rewards[(row, column)]
85 |         return action,(row, column),-0.01
86 | 
87 |     def printPolicy(self, policy):
88 |         line = ""
89 |         counter = 0
90 |         for item in policy:
91 |             line += f" | {policy[item]} | "
92 |             counter += 1
93 |             if counter > 3:
94 |                 print(line)
95 |                 print("----------------------------")
96 |                 counter = 0
97 |                 line = ""
98 |         print(line)
99 |         print("----------------------------")
100 | env= GridWorld()
101 | policy = env.getRandomPolicy()
102 | # policy = {(0, 0): 'R', (0, 1): 'R', (0, 2): 'D', (0, 3): 'L', (1, 0): 'U', (1, 1): 'R', (1, 2): 'D', (1, 3): 'D'
103 | # ,(2, 0): 'D', (2, 1): 'R', (2, 2): 'R', (2, 3): 'D', (3, 0): 'R', (3, 1): 'R', (3, 2): 'R'}
104 | # env.printPolicy(policy)
105 | averageRewards=[]
106 | rewards=[]
107 | alpha=0.1
108 | for i in range(1,2002):
109 |     state = env.reset()
110 |     stepCounts=0
111 |     episodeReward=0
112 |     while (not env.is_terminal(state)) and (stepCounts<20):
113 |         action, nextState, reward = env.move(state, policy,0.01)
114 |         stepCounts += 1
115 |         targetQ=reward
116 |         episodeReward+=reward
117 |         if not env.is_terminal(nextState):
118 |             targetQ=reward+(0.9*env.qTable[nextState][max(env.qTable[nextState], key=env.qTable[nextState].get)])
119 |         env.qTable[state][action]=env.qTable[state][action]+alpha*(targetQ-env.qTable[state][action])
120 |         state = nextState
121 |     rewards.append(episodeReward)
122 |     averageRewards.append(sum(rewards)/i)
123 |     for state in policy:
124 |         policy[state] = max(env.qTable[state], key=env.qTable[state].get)
125 |     if (i-1)%200==0:
126 |         print(f"\n\n\n step:{i-1} - Average reward so far:{sum(rewards)/i}")
127 |         print(f"\n\n\n step:{i-1}")
128 |         env.printPolicy(policy)
129 |         print("\n")
130 | print(f"exploited:{env.exploited} explored:{env.explored}")
131 | 
132 | plt.title(f'Total Rewards')
133 | plt.yscale('symlog')
134 | plt.plot(rewards)
135 | plt.savefig("Total Rewards",dpi=200)
136 | plt.clf()
137 | plt.title(f'Average Rewards')
138 | plt.yscale('symlog')
139 | plt.plot(averageRewards)
140 | plt.savefig("Average Rewards",dpi=200)
141 | 
--------------------------------------------------------------------------------
/Total Rewards.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Elktrn/Reinforcement-Learning-solving-a-simple-4by4-Gridworld-using-Qlearning-in-python/09b4b46e556f1b313bb6c0f6908234279cdc896c/Total Rewards.png
--------------------------------------------------------------------------------
/a simple 4 by 4 Gridworld.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Elktrn/Reinforcement-Learning-solving-a-simple-4by4-Gridworld-using-Qlearning-in-python/09b4b46e556f1b313bb6c0f6908234279cdc896c/a simple 4 by 4 Gridworld.png
--------------------------------------------------------------------------------
/simple 4 by 4 Gridworld.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Elktrn/Reinforcement-Learning-solving-a-simple-4by4-Gridworld-using-Qlearning-in-python/09b4b46e556f1b313bb6c0f6908234279cdc896c/simple 4 by 4 Gridworld.png
--------------------------------------------------------------------------------