├── Creating A2C.ipynb
├── Experimental
    ├── Adversarial Actors.ipynb
    ├── Backwards Replay.ipynb
    ├── IDK_DQN.py
    ├── Reducing_complex_MDP_to_simple.pdf
    ├── ReplayWithRewardToGo.py
    ├── Reward-to-go compared to Q buffer memory.ipynb
    ├── Smooth Q-learning.ipynb
    ├── Strange twin ddpg.ipynb
    ├── TDEDQN.py
    └── Twin Actor Critic.ipynb
├── LRL Playing with cartpole.ipynb
├── LRL Pong plots and visuals.ipynb
├── LRL Pong.ipynb
├── LRL
    ├── A2C.py
    ├── DDPG.py
    ├── DQN.py
    ├── GAE.py
    ├── OUNoise.py
    ├── PPO.py
    ├── QRAAC.py
    ├── QRDQN.py
    ├── TRPO.py
    ├── __init__.py
    ├── agent.py
    ├── backwardBufferAgent.py
    ├── categoricalDQN.py
    ├── doubleDQN.py
    ├── drawing_tools.py
    ├── eGreedy.py
    ├── inverseModel.py
    ├── logger.py
    ├── network_heads.py
    ├── network_modules.py
    ├── nstepReplayBuffer.py
    ├── preprocessing
    │   ├── __pycache__
    │   │   ├── atari_wrappers.cpython-36.pyc
    │   │   ├── atari_wrappers.cpython-37.pyc
    │   │   ├── multiprocessing_env.cpython-36.pyc
    │   │   └── multiprocessing_env.cpython-37.pyc
    │   ├── atari_wrappers.py
    │   └── multiprocessing_env.py
    ├── prioritizedBufferAgent.py
    ├── replayBuffer.py
    ├── targetDQN.py
    ├── twinDQN.py
    └── utils.py
├── LearningRL - Demo.ipynb
├── README.md
├── Research topics.md
├── Rubik's cube
    ├── Best results.png
    ├── GAE on Rubik....ipynb
    ├── Navigating Rubik.ipynb
    ├── Rubic invariances.ipynb
    ├── Rubik.ipynb
    ├── RubikActionMatrix.npy
    ├── RubikInvariantsMatrix.npy
    └── readme.md
└── Theory Overview
    ├── Part 1. Value-based.pdf
    ├── Part 2. Policy Gradient.pdf
    ├── Part 3. Advanced topics.pdf
    └── Readme.md

/Creating A2C.ipynb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Creating A2C.ipynb -------------------------------------------------------------------------------- /Experimental/Adversarial Actors.ipynb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Experimental/Adversarial Actors.ipynb -------------------------------------------------------------------------------- /Experimental/Backwards Replay.ipynb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Experimental/Backwards Replay.ipynb -------------------------------------------------------------------------------- /Experimental/IDK_DQN.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Experimental/IDK_DQN.py -------------------------------------------------------------------------------- /Experimental/Reducing_complex_MDP_to_simple.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Experimental/Reducing_complex_MDP_to_simple.pdf -------------------------------------------------------------------------------- /Experimental/ReplayWithRewardToGo.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Experimental/ReplayWithRewardToGo.py -------------------------------------------------------------------------------- /Experimental/Reward-to-go compared to Q buffer memory.ipynb: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Experimental/Reward-to-go compared to Q buffer memory.ipynb -------------------------------------------------------------------------------- /Experimental/Smooth Q-learning.ipynb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Experimental/Smooth Q-learning.ipynb -------------------------------------------------------------------------------- /Experimental/Strange twin ddpg.ipynb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Experimental/Strange twin ddpg.ipynb -------------------------------------------------------------------------------- /Experimental/TDEDQN.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Experimental/TDEDQN.py -------------------------------------------------------------------------------- /Experimental/Twin Actor Critic.ipynb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Experimental/Twin Actor Critic.ipynb -------------------------------------------------------------------------------- /LRL Playing with cartpole.ipynb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL Playing with cartpole.ipynb -------------------------------------------------------------------------------- /LRL Pong plots and visuals.ipynb: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL Pong plots and visuals.ipynb -------------------------------------------------------------------------------- /LRL Pong.ipynb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL Pong.ipynb -------------------------------------------------------------------------------- /LRL/A2C.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/A2C.py -------------------------------------------------------------------------------- /LRL/DDPG.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/DDPG.py -------------------------------------------------------------------------------- /LRL/DQN.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/DQN.py -------------------------------------------------------------------------------- /LRL/GAE.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/GAE.py -------------------------------------------------------------------------------- /LRL/OUNoise.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/OUNoise.py -------------------------------------------------------------------------------- /LRL/PPO.py: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/PPO.py -------------------------------------------------------------------------------- /LRL/QRAAC.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/QRAAC.py -------------------------------------------------------------------------------- /LRL/QRDQN.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/QRDQN.py -------------------------------------------------------------------------------- /LRL/TRPO.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/TRPO.py -------------------------------------------------------------------------------- /LRL/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/__init__.py -------------------------------------------------------------------------------- /LRL/agent.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/agent.py -------------------------------------------------------------------------------- /LRL/backwardBufferAgent.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/backwardBufferAgent.py -------------------------------------------------------------------------------- /LRL/categoricalDQN.py: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/categoricalDQN.py -------------------------------------------------------------------------------- /LRL/doubleDQN.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/doubleDQN.py -------------------------------------------------------------------------------- /LRL/drawing_tools.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/drawing_tools.py -------------------------------------------------------------------------------- /LRL/eGreedy.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/eGreedy.py -------------------------------------------------------------------------------- /LRL/inverseModel.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/inverseModel.py -------------------------------------------------------------------------------- /LRL/logger.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/logger.py -------------------------------------------------------------------------------- /LRL/network_heads.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/network_heads.py -------------------------------------------------------------------------------- /LRL/network_modules.py: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/network_modules.py -------------------------------------------------------------------------------- /LRL/nstepReplayBuffer.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/nstepReplayBuffer.py -------------------------------------------------------------------------------- /LRL/preprocessing/__pycache__/atari_wrappers.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/preprocessing/__pycache__/atari_wrappers.cpython-36.pyc -------------------------------------------------------------------------------- /LRL/preprocessing/__pycache__/atari_wrappers.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/preprocessing/__pycache__/atari_wrappers.cpython-37.pyc -------------------------------------------------------------------------------- /LRL/preprocessing/__pycache__/multiprocessing_env.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/preprocessing/__pycache__/multiprocessing_env.cpython-36.pyc -------------------------------------------------------------------------------- /LRL/preprocessing/__pycache__/multiprocessing_env.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/preprocessing/__pycache__/multiprocessing_env.cpython-37.pyc 
-------------------------------------------------------------------------------- /LRL/preprocessing/atari_wrappers.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/preprocessing/atari_wrappers.py -------------------------------------------------------------------------------- /LRL/preprocessing/multiprocessing_env.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/preprocessing/multiprocessing_env.py -------------------------------------------------------------------------------- /LRL/prioritizedBufferAgent.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/prioritizedBufferAgent.py -------------------------------------------------------------------------------- /LRL/replayBuffer.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/replayBuffer.py -------------------------------------------------------------------------------- /LRL/targetDQN.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/targetDQN.py -------------------------------------------------------------------------------- /LRL/twinDQN.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/twinDQN.py -------------------------------------------------------------------------------- /LRL/utils.py: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LRL/utils.py -------------------------------------------------------------------------------- /LearningRL - Demo.ipynb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/LearningRL - Demo.ipynb -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/README.md -------------------------------------------------------------------------------- /Research topics.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Research topics.md -------------------------------------------------------------------------------- /Rubik's cube/Best results.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Rubik's cube/Best results.png -------------------------------------------------------------------------------- /Rubik's cube/GAE on Rubik....ipynb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Rubik's cube/GAE on Rubik....ipynb -------------------------------------------------------------------------------- /Rubik's cube/Navigating Rubik.ipynb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Rubik's cube/Navigating Rubik.ipynb -------------------------------------------------------------------------------- /Rubik's 
cube/Rubic invariances.ipynb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Rubik's cube/Rubic invariances.ipynb -------------------------------------------------------------------------------- /Rubik's cube/Rubik.ipynb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Rubik's cube/Rubik.ipynb -------------------------------------------------------------------------------- /Rubik's cube/RubikActionMatrix.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Rubik's cube/RubikActionMatrix.npy -------------------------------------------------------------------------------- /Rubik's cube/RubikInvariantsMatrix.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Rubik's cube/RubikInvariantsMatrix.npy -------------------------------------------------------------------------------- /Rubik's cube/readme.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Rubik's cube/readme.md -------------------------------------------------------------------------------- /Theory Overview/Part 1. Value-based.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Theory Overview/Part 1. Value-based.pdf -------------------------------------------------------------------------------- /Theory Overview/Part 2. 
Policy Gradient.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Theory Overview/Part 2. Policy Gradient.pdf -------------------------------------------------------------------------------- /Theory Overview/Part 3. Advanced topics.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Theory Overview/Part 3. Advanced topics.pdf -------------------------------------------------------------------------------- /Theory Overview/Readme.md: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/FortsAndMills/Learning-Reinforcement-Learning/HEAD/Theory Overview/Readme.md --------------------------------------------------------------------------------