├── .gitignore
├── Chapter1-初识强化学习
│   └── 1.6-案例:基于Gym库的智能体-环境交互.ipynb
├── Chapter2-Markov决策过程
│   ├── 2.2-Bellman期望方程.ipynb
│   ├── 2.3-最优策略及其性质.ipynb
│   └── 2.4-案例:悬崖寻路.ipynb
├── Chapter3-有模型数值迭代
│   └── 3.5-案例:冰面滑行.ipynb
├── Chapter4-回合更新价值迭代
│   └── 4.3-案例:21点游戏.ipynb
├── Chapter5-时序差分价值迭代
│   └── 5.4-案例:出租车调度.ipynb
├── Chapter6-函数近似方法
│   └── 6.5-案例:小车上山.ipynb
├── Chapter7-回合更新策略梯度方法
│   └── 7.5-案例:车杆平衡.ipynb
└── README.md

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/quqixun/RL-Python-Pytorch/HEAD/.gitignore
--------------------------------------------------------------------------------

/Chapter1-初识强化学习/1.6-案例:基于Gym库的智能体-环境交互.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/quqixun/RL-Python-Pytorch/HEAD/Chapter1-初识强化学习/1.6-案例:基于Gym库的智能体-环境交互.ipynb
--------------------------------------------------------------------------------

/Chapter2-Markov决策过程/2.2-Bellman期望方程.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/quqixun/RL-Python-Pytorch/HEAD/Chapter2-Markov决策过程/2.2-Bellman期望方程.ipynb
--------------------------------------------------------------------------------

/Chapter2-Markov决策过程/2.3-最优策略及其性质.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/quqixun/RL-Python-Pytorch/HEAD/Chapter2-Markov决策过程/2.3-最优策略及其性质.ipynb
--------------------------------------------------------------------------------

/Chapter2-Markov决策过程/2.4-案例:悬崖寻路.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/quqixun/RL-Python-Pytorch/HEAD/Chapter2-Markov决策过程/2.4-案例:悬崖寻路.ipynb
--------------------------------------------------------------------------------

/Chapter3-有模型数值迭代/3.5-案例:冰面滑行.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/quqixun/RL-Python-Pytorch/HEAD/Chapter3-有模型数值迭代/3.5-案例:冰面滑行.ipynb
--------------------------------------------------------------------------------

/Chapter4-回合更新价值迭代/4.3-案例:21点游戏.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/quqixun/RL-Python-Pytorch/HEAD/Chapter4-回合更新价值迭代/4.3-案例:21点游戏.ipynb
--------------------------------------------------------------------------------

/Chapter5-时序差分价值迭代/5.4-案例:出租车调度.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/quqixun/RL-Python-Pytorch/HEAD/Chapter5-时序差分价值迭代/5.4-案例:出租车调度.ipynb
--------------------------------------------------------------------------------

/Chapter6-函数近似方法/6.5-案例:小车上山.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/quqixun/RL-Python-Pytorch/HEAD/Chapter6-函数近似方法/6.5-案例:小车上山.ipynb
--------------------------------------------------------------------------------

/Chapter7-回合更新策略梯度方法/7.5-案例:车杆平衡.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/quqixun/RL-Python-Pytorch/HEAD/Chapter7-回合更新策略梯度方法/7.5-案例:车杆平衡.ipynb
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/quqixun/RL-Python-Pytorch/HEAD/README.md
--------------------------------------------------------------------------------