├── .gitignore
├── README.md
├── ch01_introduction
│   ├── images
│   │   └── backup.png
│   └── introduction.ipynb
├── ch02_multi-armed_bandits
│   ├── images
│   │   ├── a_simple_bandit_algorithm.png
│   │   ├── multi-armed_bandit.jpg
│   │   └── one-armed_bandit.png
│   └── multi-armed_bandits.ipynb
└── ch03_finite_markov_decision_processes
    ├── finite_markov_decision_processes.ipynb
    └── images
        ├── backup_diagram.png
        ├── example3.3.png
        ├── exercise3.19.png
        ├── fig3.1.png
        ├── figure3.2.png
        ├── figure3.4.png
        └── figure3.5.png

/.gitignore:
--------------------------------------------------------------------------------
.ipynb_checkpoints
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/README.md
--------------------------------------------------------------------------------
/ch01_introduction/images/backup.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch01_introduction/images/backup.png
--------------------------------------------------------------------------------
/ch01_introduction/introduction.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch01_introduction/introduction.ipynb
--------------------------------------------------------------------------------
/ch02_multi-armed_bandits/images/a_simple_bandit_algorithm.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch02_multi-armed_bandits/images/a_simple_bandit_algorithm.png
--------------------------------------------------------------------------------
/ch02_multi-armed_bandits/images/multi-armed_bandit.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch02_multi-armed_bandits/images/multi-armed_bandit.jpg
--------------------------------------------------------------------------------
/ch02_multi-armed_bandits/images/one-armed_bandit.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch02_multi-armed_bandits/images/one-armed_bandit.png
--------------------------------------------------------------------------------
/ch02_multi-armed_bandits/multi-armed_bandits.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch02_multi-armed_bandits/multi-armed_bandits.ipynb
--------------------------------------------------------------------------------
/ch03_finite_markov_decision_processes/finite_markov_decision_processes.ipynb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch03_finite_markov_decision_processes/finite_markov_decision_processes.ipynb
--------------------------------------------------------------------------------
/ch03_finite_markov_decision_processes/images/backup_diagram.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch03_finite_markov_decision_processes/images/backup_diagram.png
--------------------------------------------------------------------------------
/ch03_finite_markov_decision_processes/images/example3.3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch03_finite_markov_decision_processes/images/example3.3.png
--------------------------------------------------------------------------------
/ch03_finite_markov_decision_processes/images/exercise3.19.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch03_finite_markov_decision_processes/images/exercise3.19.png
--------------------------------------------------------------------------------
/ch03_finite_markov_decision_processes/images/fig3.1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch03_finite_markov_decision_processes/images/fig3.1.png
--------------------------------------------------------------------------------
/ch03_finite_markov_decision_processes/images/figure3.2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch03_finite_markov_decision_processes/images/figure3.2.png
--------------------------------------------------------------------------------
/ch03_finite_markov_decision_processes/images/figure3.4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch03_finite_markov_decision_processes/images/figure3.4.png
--------------------------------------------------------------------------------
/ch03_finite_markov_decision_processes/images/figure3.5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Curt-Park/reinforcement_learning_an_introduction/HEAD/ch03_finite_markov_decision_processes/images/figure3.5.png
--------------------------------------------------------------------------------