Exploding And Vanishing Gradients

When training very deep networks, gradients can shrink toward zero or grow without bound as they are backpropagated through many layers. These problems, known as vanishing and exploding gradients, make training unstable. In this video we introduce two Trainer flags: track_grad_norm, which logs gradient norms so you can spot vanishing and exploding gradients, and gradient_clip_val, which clips the gradient norm computed over all model parameters together.
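To make the clipping behavior concrete, here is a minimal pure-Python sketch of global-norm clipping, the strategy gradient_clip_val uses by default: the 2-norm is taken over all gradients together, and every gradient is scaled by the same factor. The function name clip_grad_norm and the flat list of floats are illustrative simplifications; in practice Lightning delegates to torch.nn.utils.clip_grad_norm_ on the model's parameter tensors.

```python
import math

def clip_grad_norm(grads, clip_val):
    """Scale gradients so their combined 2-norm is at most clip_val.

    Illustrative sketch: the norm is computed over ALL gradients
    together, then each one is scaled by the same factor, so the
    direction of the overall update is preserved.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > clip_val:
        scale = clip_val / total_norm
        grads = [g * scale for g in grads]
    return grads, total_norm

# A gradient vector of norm 5.0 gets rescaled to norm 1.0:
clipped, norm = clip_grad_norm([3.0, 4.0], clip_val=1.0)
```

In Lightning you would simply pass gradient_clip_val=1.0 to the Trainer and it applies this clipping before every optimizer step.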

Follow along with this notebook: https://bit.ly/33YzC1P

GitHub: https://github.com/PyTorchLightning/pytorch-lightning
Lightning Website: https://www.pytorchlightning.ai/
Follow us on Twitter: https://twitter.com/PyTorchLightnin
