2021 DeepMind x UCL RL Lecture Series – Policy-Gradient and Actor-Critic methods [9/13]

Research Scientist Hado van Hasselt covers policy-gradient algorithms, which learn policies directly, and actor-critic algorithms, which combine policy learning with value predictions for more efficient learning.

Slides: https://dpmd.ai/policygradient
Full video lecture series: https://dpmd.ai/DeepMindxUCL21
