Proximal Policy Optimization is Easy with TensorFlow 2 | PPO Tutorial

Proximal Policy Optimization (PPO) has emerged as a powerful on-policy actor-critic algorithm. You might think that implementing it is difficult, but in fact TensorFlow 2 makes coding up a PPO agent relatively simple.

We’re going to build on my PyTorch implementation, as it serves as a great starting point to expand on. Simply head over to my GitHub, copy the code, and follow along.
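To give you a taste of what we'll be writing, here is a rough sketch of the clipped surrogate loss at the heart of PPO. The function and variable names here are illustrative assumptions and won't necessarily match the code in the repository linked below.

```python
# A minimal sketch of PPO's clipped surrogate (actor) loss in TensorFlow 2.
# Names like new_log_probs / old_log_probs / advantages are assumptions
# for illustration, not identifiers from the linked repository.
import tensorflow as tf

def ppo_actor_loss(new_log_probs, old_log_probs, advantages, clip=0.2):
    # Probability ratio between the updated policy and the old policy.
    ratio = tf.exp(new_log_probs - old_log_probs)
    # Unclipped and clipped surrogate objectives.
    surrogate = ratio * advantages
    clipped = tf.clip_by_value(ratio, 1 - clip, 1 + clip) * advantages
    # PPO maximizes the minimum of the two; we return the negative
    # so it can be minimized with a standard optimizer.
    return -tf.reduce_mean(tf.minimum(surrogate, clipped))
```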

Code for this video is here:
https://github.com/philtabor/Youtube-Code-Repository/tree/master/ReinforcementLearning/PolicyGradient/PPO/tf2

Learn how to turn deep reinforcement learning papers into code:

Deep Q Learning:
https://www.udemy.com/course/deep-q-learning-from-paper-to-code/?couponCode=DQN-AUG-2021

Actor Critic Methods:
https://www.udemy.com/course/actor-critic-methods-from-paper-to-code-with-pytorch/?couponCode=AC-AUG-2021

Natural Language Processing from First Principles:
https://www.udemy.com/course/natural-language-processing-from-first-principles/?couponCode=NLP1-AUG-2021

Reinforcement Learning Fundamentals:
https://www.manning.com/livevideo/reinforcement-learning-in-motion

Here are some books / courses I recommend (affiliate links):
Grokking Deep Learning in Motion: https://bit.ly/3fXHy8W
Grokking Deep Learning: https://bit.ly/3yJ14gT
Grokking Deep Reinforcement Learning: https://bit.ly/2VNAXql

Come hang out on Discord here:
https://discord.gg/Zr4VCdv

Website: https://www.neuralnet.ai
Github: https://github.com/philtabor
Twitter: https://twitter.com/MLWithPhil

Time stamps:
00:00 Intro
01:17 Code restructure
01:57 PPO Memory
03:05 Network classes
08:41 Agent class
24:39 Main file
25:54 Moment of Truth
