├── .gitignore
├── 0 - Introduction to Gym.ipynb
├── 1 - Vanilla Policy Gradient (REINFORCE) [CartPole].ipynb
├── 1_policy_gradient.ipynb
├── 2 - Actor Critic [CartPole].ipynb
├── 2_q_learning.ipynb
├── 3 - Advantage Actor Critic (A2C) [CartPole].ipynb
├── 3_advantage_actor_critic.ipynb
├── 3a - Advantage Actor Critic (A2C) [LunarLander].ipynb
├── 4 - Generalized Advantage Estimation (GAE) [CartPole].ipynb
├── 4a - Generalized Advantage Estimation (GAE) [LunarLander].ipynb
├── 5 - Proximal Policy Optimization (PPO) [CartPole].ipynb
├── 5a - Proximal Policy Optimization (PPO) [LunarLander].ipynb
├── 8 - n step A2C.ipynb
├── LICENSE
├── README.md
├── checkpoint_viz.ipynb
├── dqn_working.ipynb
├── n_step_a2c.py
├── q_learning.py
└── runner.py

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/.gitignore
--------------------------------------------------------------------------------
/0 - Introduction to Gym.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/0 - Introduction to Gym.ipynb
--------------------------------------------------------------------------------
/1 - Vanilla Policy Gradient (REINFORCE) [CartPole].ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/1 - Vanilla Policy Gradient (REINFORCE) [CartPole].ipynb
--------------------------------------------------------------------------------
/1_policy_gradient.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/1_policy_gradient.ipynb
--------------------------------------------------------------------------------
/2 - Actor Critic [CartPole].ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/2 - Actor Critic [CartPole].ipynb
--------------------------------------------------------------------------------
/2_q_learning.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/2_q_learning.ipynb
--------------------------------------------------------------------------------
/3 - Advantage Actor Critic (A2C) [CartPole].ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/3 - Advantage Actor Critic (A2C) [CartPole].ipynb
--------------------------------------------------------------------------------
/3_advantage_actor_critic.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/3_advantage_actor_critic.ipynb
--------------------------------------------------------------------------------
/3a - Advantage Actor Critic (A2C) [LunarLander].ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/3a - Advantage Actor Critic (A2C) [LunarLander].ipynb
--------------------------------------------------------------------------------
/4 - Generalized Advantage Estimation (GAE) [CartPole].ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/4 - Generalized Advantage Estimation (GAE) [CartPole].ipynb
--------------------------------------------------------------------------------
/4a - Generalized Advantage Estimation (GAE) [LunarLander].ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/4a - Generalized Advantage Estimation (GAE) [LunarLander].ipynb
--------------------------------------------------------------------------------
/5 - Proximal Policy Optimization (PPO) [CartPole].ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/5 - Proximal Policy Optimization (PPO) [CartPole].ipynb
--------------------------------------------------------------------------------
/5a - Proximal Policy Optimization (PPO) [LunarLander].ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/5a - Proximal Policy Optimization (PPO) [LunarLander].ipynb
--------------------------------------------------------------------------------
/8 - n step A2C.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/8 - n step A2C.ipynb
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/README.md
--------------------------------------------------------------------------------
/checkpoint_viz.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/checkpoint_viz.ipynb
--------------------------------------------------------------------------------
/dqn_working.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/dqn_working.ipynb
--------------------------------------------------------------------------------
/n_step_a2c.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/n_step_a2c.py
--------------------------------------------------------------------------------
/q_learning.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/q_learning.py
--------------------------------------------------------------------------------
/runner.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bentrevett/pytorch-rl/HEAD/runner.py
--------------------------------------------------------------------------------