├── .DS_Store
├── scripts
│   ├── .DS_Store
│   ├── main.py
│   ├── maze_environment.py
│   └── maze_solver.py
├── programming assignment adprl.pdf
├── gundogan_alperen_03694565_report.pdf
├── maze.txt
└── README.md
/programming assignment adprl.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gundoganalperen/adp_rl/HEAD/programming assignment adprl.pdf
--------------------------------------------------------------------------------
/gundogan_alperen_03694565_report.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gundoganalperen/adp_rl/HEAD/gundogan_alperen_03694565_report.pdf
--------------------------------------------------------------------------------
/maze.txt:
--------------------------------------------------------------------------------
1 | # This is the definition of a maze
2 | # Lines starting with # must be ignored
3 | # 1: Wall 0: Free
4 | # S: Start G: Goal T: Trap
5 | #
6 | 1 1 1 1 1 1 1 1 1 1
7 | 1 0 0 0 0 0 1 0 0 1
8 | 1 0 1 1 1 0 0 0 0 1
9 | 1 0 1 T 1 0 1 0 S 1
10 | 1 0 0 0 0 0 1 0 0 1
11 | 1 0 1 1 1 1 1 1 1 1
12 | 1 0 0 0 0 0 0 0 G 1
13 | 1 1 1 1 1 1 1 1 1 1
--------------------------------------------------------------------------------
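The format described in the comment lines of maze.txt (`#` for comments, `1` for walls, `0` for free cells, and `S`, `G`, `T` for start, goal, and trap) can be read with a few lines of Python. The following is a minimal sketch, not the repository's own loader: the helper names `parse_maze` and `find_cells` are hypothetical, and it assumes cells are separated by whitespace, as in the file above.

```python
def parse_maze(text):
    """Return the maze as a list of rows, each row a list of cell symbols.

    Blank lines and lines starting with '#' are skipped, as the maze file requires.
    """
    grid = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        grid.append(line.split())
    return grid


def find_cells(grid, symbol):
    """Return the (row, column) positions of every cell equal to `symbol`."""
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, cell in enumerate(row)
        if cell == symbol
    ]


# The maze from maze.txt, embedded here so the sketch runs on its own.
MAZE_TEXT = """\
# This is the definition of a maze
# Lines starting with # must be ignored
# 1: Wall 0: Free
# S: Start G: Goal T: Trap
#
1 1 1 1 1 1 1 1 1 1
1 0 0 0 0 0 1 0 0 1
1 0 1 1 1 0 0 0 0 1
1 0 1 T 1 0 1 0 S 1
1 0 0 0 0 0 1 0 0 1
1 0 1 1 1 1 1 1 1 1
1 0 0 0 0 0 0 0 G 1
1 1 1 1 1 1 1 1 1 1
"""

grid = parse_maze(MAZE_TEXT)
start = find_cells(grid, "S")  # zero-based (row, column) positions
goal = find_cells(grid, "G")
traps = find_cells(grid, "T")
```

Positions are zero-based, with row 0 being the first non-comment line of the file. Keeping the cells as strings avoids mixing integers with the letter symbols.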
/README.md:
--------------------------------------------------------------------------------
1 | # adp_rl
2 | Approximate Dynamic Programming and Reinforcement Learning - Programming Assignment
3 |
4 | The purpose of this assignment is to implement a simple maze environment and learn to make optimal decisions inside it by solving the problem with Dynamic Programming. Two methods are implemented and analyzed: Value Iteration (VI) and Policy Iteration (PI), where PI alternates between Policy Evaluation and Policy Improvement.
5 |
6 | To launch the application, run `python main.py /absolute/path/to/maze.txt` from the `scripts` directory.
7 |
--------------------------------------------------------------------------------
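To make the Value Iteration step mentioned in the README concrete, here is a minimal sketch on a tiny wall-bounded grid. It is not the code in maze_solver.py. Everything numeric is an assumption chosen for illustration: deterministic moves, a cost of 1 per step, an extra cost for landing on a trap cell, a discount factor of 0.9, and a goal that is absorbing with zero cost. The names `step`, `q_value`, and `value_iteration` are hypothetical; the iteration cap mirrors `MAX_ITER = 1000` from main.py.

```python
# Assumed action set: four axis-aligned moves as (row, column) offsets.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(grid, state, action):
    """Deterministic transition (assumption): hitting a wall or edge keeps the state."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[nr])
    if not inside or grid[nr][nc] == "1":
        return state
    return (nr, nc)


def q_value(grid, V, state, action, gamma, step_cost, trap_cost):
    """One-step cost plus discounted value of the successor state."""
    nxt = step(grid, state, action)
    cost = step_cost + (trap_cost if grid[nxt[0]][nxt[1]] == "T" else 0.0)
    return cost + gamma * V[nxt]


def value_iteration(grid, gamma=0.9, step_cost=1.0, trap_cost=50.0,
                    tol=1e-8, max_iter=1000):
    """Synchronous value iteration that minimises expected discounted cost."""
    states = [(r, c) for r, row in enumerate(grid)
              for c, cell in enumerate(row) if cell != "1"]
    goals = {s for s in states if grid[s[0]][s[1]] == "G"}
    V = {s: 0.0 for s in states}
    for _ in range(max_iter):
        # Bellman backup for every state, using the previous sweep's values.
        new_V = {
            s: 0.0 if s in goals else min(
                q_value(grid, V, s, a, gamma, step_cost, trap_cost)
                for a in ACTIONS)
            for s in states
        }
        delta = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if delta < tol:
            break
    # Greedy policy: the cheapest action under the converged values.
    policy = {
        s: min(ACTIONS,
               key=lambda a: q_value(grid, V, s, a, gamma, step_cost, trap_cost))
        for s in states if s not in goals
    }
    return V, policy


# Tiny illustrative maze (not maze.txt): start, one free cell, goal.
TINY = [row.split() for row in ("1 1 1 1 1", "1 S 0 G 1", "1 1 1 1 1")]
V, policy = value_iteration(TINY)
```

With these assumed costs the free cell next to the goal has value 1 and the start cell has value 1 + 0.9 × 1 = 1.9, and the greedy policy moves right from both. A reward-maximising formulation would replace `min` with `max` and flip the signs of the costs.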
/scripts/main.py:
--------------------------------------------------------------------------------
1 | """
2 | Approximate Dynamic Programming & Reinforcement Learning - WS 2018
3 | Programming Assignment
4 | Alperen Gundogan
5 |
6 | 30.01.2019
7 |
8 | Command necessary to test the code.
9 | python main.py maze.txt
10 | """
11 | from __future__ import division
12 | import sys
13 | import os
14 | from maze_environment import Maze_env
15 | from maze_solver import Maze_solver
16 |
17 | import matplotlib.pyplot as plt
18 | from matplotlib.pyplot import figure
19 |
20 | MAX_ITER = 1000
21 |
22 | if __name__ == '__main__':
23 |     if len(sys.argv) < 2:
24 |         print('Arguments: