Proximal Policy Optimization (PPO) is Easy With PyTorch | Full PPO Tutorial

Proximal Policy Optimization (PPO) is an advanced actor-critic algorithm designed to improve performance by constraining how much each update can change our actor network. It’s relatively straightforward to implement in code, and in this full tutorial you’re going to get a mini lecture covering the essential concepts behind the PPO algorithm, as well as a complete implementation in the PyTorch framework. We’ll test our algorithm in a simple OpenAI Gym environment: CartPole.
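To give a feel for what "constraining updates" means before the lecture, here is a minimal sketch of the clipped surrogate objective used by the most common PPO variant. It is written for a single sample in plain Python so it runs without any dependencies. The function name `clipped_surrogate` and the default `clip_eps=0.2` are assumptions made for illustration, not taken from the tutorial's code.

```python
import math

def clipped_surrogate(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    """Per-sample clipped surrogate objective (a quantity to maximize).

    Hypothetical helper for illustration; clip_eps=0.2 is an assumed default.
    """
    # Probability ratio between the updated policy and the policy that
    # collected the data, computed from log probabilities.
    ratio = math.exp(new_log_prob - old_log_prob)
    # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the more pessimistic of the unclipped and clipped terms, so
    # moving the ratio beyond the clip range in the direction the
    # advantage favors earns no extra objective.
    return min(ratio * advantage, clipped_ratio * advantage)

# The new policy makes the action 1.5x more likely and the advantage is
# positive: the objective is capped at (1 + 0.2) * 2.0.
print(clipped_surrogate(math.log(1.5), 0.0, advantage=2.0))
```

In a PyTorch implementation the same idea is usually applied to whole batches of tensors, with `torch.clamp` for the clipping and `torch.min` for the elementwise minimum, and the mean of the result is negated to form the actor loss.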

