Proximal Policy Optimization is Easy with TensorFlow 2 | PPO Tutorial

Proximal Policy Optimization (PPO) has emerged as a powerful on-policy actor-critic algorithm. You might think that implementing it is difficult, but in fact TensorFlow 2 makes coding up a PPO agent relatively simple.
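
For a sense of what sits at the core of a PPO agent, here is a minimal sketch of the clipped surrogate loss in plain Python. It is not the code from the video: the function name, the argument names, and the 0.2 clip value are assumptions chosen for illustration. In a TensorFlow 2 agent the same arithmetic would typically be written with tensor ops such as tf.exp, tf.clip_by_value, tf.minimum, and tf.reduce_mean.

```python
import math


def clipped_surrogate_loss(new_log_probs, old_log_probs, advantages, clip=0.2):
    """Return the negative mean of PPO's clipped surrogate objective.

    All three inputs are equal-length lists of floats, one entry per
    sampled (state, action) pair. The names are hypothetical.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the policy
        # that collected the data.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip, 1 + clip].
        clipped_ratio = max(1.0 - clip, min(1.0 + clip, ratio))
        # Take the more pessimistic of the two weighted advantages.
        total += min(ratio * adv, clipped_ratio * adv)
    # Negate because optimizers minimize, while the objective is maximized.
    return -total / len(advantages)


# Identical policies give a ratio of 1, so the loss is just -mean(advantage).
loss = clipped_surrogate_loss([0.0, 0.0], [0.0, 0.0], [1.0, 3.0])
```

The clipping is what makes the update "proximal": once the new policy drifts more than the clip range away from the old one, a positive advantage stops contributing extra gain, so a single batch cannot push the policy too far.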

We’re going to take advantage of my PyTorch code for this, as it serves as a great basis to expand on. Simply go to my GitHub, copy the code, and then follow along.

Code for this video is here:

Learn how to turn deep reinforcement learning papers into code:

Deep Q Learning:

Actor Critic Methods:

Natural Language Processing from First Principles:

Reinforcement Learning Fundamentals:

Here are some books / courses I recommend (affiliate links):
Grokking Deep Learning in Motion:
Grokking Deep Learning:
Grokking Deep Reinforcement Learning:

Come hang out on Discord here:


Time stamps:
00:00 Intro
01:17 Code restructure
01:57 PPO Memory
03:05 Network classes
08:41 Agent class
24:39 Main file
25:54 Moment of Truth
