Policy iteration and value iteration - Policy iteration and value iteration are two important algorithms in reinforcement learning. Both are based on dynamic programming and the Bellman equation, and both are useful for finding the optimal policy when the agent knows enough about the environment model (its transition probabilities and rewards). In this video we also talk about the Bellman optimality equation and the optimal value function in reinforcement learning.
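For readers who want to try the idea before watching, here is a minimal sketch of value iteration in plain Python. It is not the code from the video: the function name `value_iteration`, the nested-list layout of `P` and `R`, the default `gamma`, and the two-state toy MDP are all assumptions made up for illustration. The loop repeatedly applies the Bellman optimality backup until the values stop changing, then reads off a greedy policy.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Sketch of value iteration for a small, fully known MDP.

    Assumed (hypothetical) layout:
      P[s][a][s2] = probability of landing in state s2 after action a in state s
      R[s][a]     = expected immediate reward for action a in state s
    """
    n_states = len(R)
    n_actions = len(R[0])

    def q_value(V, s, a):
        # One-step lookahead: immediate reward plus discounted expected next value.
        return R[s][a] + gamma * sum(P[s][a][s2] * V[s2] for s2 in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iters):
        # Bellman optimality backup: best one-step lookahead in every state.
        V_new = [max(q_value(V, s, a) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # Greedy policy with respect to the final value estimates.
    policy = [max(range(n_actions), key=lambda a: q_value(V, s, a))
              for s in range(n_states)]
    return V, policy


# Toy MDP (made up for this example): action 0 = stay, action 1 = switch state.
# Staying in state 1 pays a reward of 1; everything else pays 0.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0: stay -> 0, switch -> 1
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1: stay -> 1, switch -> 0
]
R = [
    [0.0, 0.0],
    [1.0, 0.0],
]

V, policy = value_iteration(P, R)
print([round(v, 3) for v in V], policy)
```

On this toy problem the greedy policy is to switch out of state 0 and then stay in state 1. Policy iteration would reach the same policy by alternating a full policy evaluation step with a greedy improvement step, rather than folding both into a single backup.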
Reinforcement learning tutorial series:
1. Multi-armed Bandits: https://youtu.be/_XsIv-35c6o
2. Multi-Armed Bandits - Action value estimation: https://youtu.be/ojjzpDrUppI
3. Upper confidence bound: https://youtu.be/RPbtzWgzD9M
4. Thompson Sampling: https://youtu.be/p701cYQeqew
5. Markov Decision Process - MDP: https://youtu.be/Hm2H97aHTJE
6. Policy iteration and value iteration: https://youtu.be/BAetsPIojg4