├── LOG.md
├── README.md
├── lec01
│   ├── LOG.md
│   ├── greedy_1.png
│   ├── greedy_10.png
│   ├── greedy_100.png
│   ├── greedy_3.png
│   ├── greedy_3_reorder.png
│   ├── optimistic_1.png
│   ├── optimistic_10.png
│   ├── optimistic_100.png
│   ├── optimistic_3.png
│   ├── rewards_1000.png
│   ├── rewards_1000_3.png
│   ├── rewards_20000.png
│   ├── rewards_20000_2.png
│   ├── rewards_20000_3.png
│   ├── t.py
│   ├── ucb1.png
│   └── ucb1_long.png
├── policy_gradient.py
├── sarsa.png
├── sarsa_alphas.png
├── sarsa_illegal.png
├── t.py
├── tictactoe.py
└── tiger.py

/LOG.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/LOG.md
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/README.md
--------------------------------------------------------------------------------
/lec01/LOG.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/LOG.md
--------------------------------------------------------------------------------
/lec01/greedy_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/greedy_1.png
--------------------------------------------------------------------------------
/lec01/greedy_10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/greedy_10.png
--------------------------------------------------------------------------------
/lec01/greedy_100.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/greedy_100.png
--------------------------------------------------------------------------------
/lec01/greedy_3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/greedy_3.png
--------------------------------------------------------------------------------
/lec01/greedy_3_reorder.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/greedy_3_reorder.png
--------------------------------------------------------------------------------
/lec01/optimistic_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/optimistic_1.png
--------------------------------------------------------------------------------
/lec01/optimistic_10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/optimistic_10.png
--------------------------------------------------------------------------------
/lec01/optimistic_100.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/optimistic_100.png
--------------------------------------------------------------------------------
/lec01/optimistic_3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/optimistic_3.png
--------------------------------------------------------------------------------
/lec01/rewards_1000.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/rewards_1000.png
--------------------------------------------------------------------------------
/lec01/rewards_1000_3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/rewards_1000_3.png
--------------------------------------------------------------------------------
/lec01/rewards_20000.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/rewards_20000.png
--------------------------------------------------------------------------------
/lec01/rewards_20000_2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/rewards_20000_2.png
--------------------------------------------------------------------------------
/lec01/rewards_20000_3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/rewards_20000_3.png
--------------------------------------------------------------------------------
/lec01/t.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/t.py
--------------------------------------------------------------------------------
/lec01/ucb1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/ucb1.png
--------------------------------------------------------------------------------
/lec01/ucb1_long.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/lec01/ucb1_long.png
--------------------------------------------------------------------------------
/policy_gradient.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/policy_gradient.py
--------------------------------------------------------------------------------
/sarsa.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/sarsa.png
--------------------------------------------------------------------------------
/sarsa_alphas.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/sarsa_alphas.png
--------------------------------------------------------------------------------
/sarsa_illegal.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/sarsa_illegal.png
--------------------------------------------------------------------------------
/t.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/t.py
--------------------------------------------------------------------------------
/tictactoe.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/tictactoe.py
--------------------------------------------------------------------------------
/tiger.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/nishio/reinforcement_learning/HEAD/tiger.py
--------------------------------------------------------------------------------