Proximal Policy Optimization is Easy with TensorFlow 2 | PPO Tutorial

Proximal Policy Optimization (PPO) has emerged as a powerful on-policy actor-critic algorithm. You might think that implementing it is difficult, but in fact TensorFlow 2 makes coding up a PPO agent relatively simple.

We’re going to take advantage of my PyTorch code for this, as it serves as a great basis to expand on. Simply go to my GitHub, copy the code, and follow along.

Time stamps:
00:00 Intro
01:17 Code restructure
01:57 PPO Memory
03:05 Network classes
08:41 Agent class
24:39 Main file
25:54 Moment of Truth

