Reinforcement Learning in the OpenAI Gym (Tutorial) – SARSA

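When we last left off, we covered the Q-learning algorithm for solving the cart pole problem from the OpenAI Gym. Related to Q-learning is the SARSA algorithm, which also performs quite well. SARSA differs from Q-learning in that it is an on-policy, rather than off-policy, reinforcement learning algorithm: its update bootstraps from the action the agent actually takes in the next state, whereas Q-learning bootstraps from the greedy action regardless of what the agent does next.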
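To make the on-policy versus off-policy distinction concrete, here is a minimal sketch of the two update targets for a tabular agent. It is not the code from the video: the function names, the dict-of-lists Q table, and the numbers are all hypothetical, and it assumes the cart pole observations have already been discretized into integer state indices.

```python
def q_learning_target(Q, reward, next_state, gamma):
    # Off-policy: bootstrap from the best-valued action in next_state,
    # whether or not the agent will actually take it.
    return reward + gamma * max(Q[next_state])


def sarsa_target(Q, reward, next_state, next_action, gamma):
    # On-policy: bootstrap from the action the agent has actually chosen
    # for next_state (for example, via epsilon-greedy).
    return reward + gamma * Q[next_state][next_action]


def td_update(Q, state, action, target, alpha):
    # Shared temporal-difference step: move Q(s, a) toward the target.
    Q[state][action] += alpha * (target - Q[state][action])


# Hypothetical two-state, two-action table (illustrative values only).
Q = {0: [0.0, 0.0], 1: [1.0, 3.0]}
reward, gamma, alpha = 1.0, 0.5, 0.5

# Suppose exploration picked action 0 in state 1, not the greedy action 1.
q_target = q_learning_target(Q, reward, next_state=1, gamma=gamma)
s_target = sarsa_target(Q, reward, next_state=1, next_action=0, gamma=gamma)

# Apply the SARSA update to Q(0, 0).
td_update(Q, state=0, action=0, target=s_target, alpha=alpha)
```

The two targets coincide only when the action chosen for the next state happens to be the greedy one; whenever the agent explores, SARSA's estimate reflects that exploratory choice while Q-learning's does not.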

In this video, I’ll go over the SARSA algorithm and show you how to use it to get the cart pole to dance.

