WHY AND HOW OF SCALING LARGE LANGUAGE MODELS | NICHOLAS JOSEPH

Anthropic is an AI safety and research company working to build reliable, interpretable, and steerable AI systems. Over the past decade, the amount of compute used for the largest training runs has increased at an exponential pace. Across many domains, larger models attain better performance, following precise scaling laws. The compute needed to train these models can only be supplied by many machines working in coordination and communicating data among themselves. In this talk, Nicholas Joseph (Technical Staff, Anthropic) explains why and how Anthropic scales up training runs to use these machines efficiently.
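
As a concrete illustration of the scaling laws mentioned above: empirically, loss often falls as a power law in training compute, roughly loss(C) ≈ a · C^(−b). The sketch below fits that form to hypothetical (compute, loss) pairs in log-log space; the data points, fitted constants, and variable names are illustrative assumptions, not figures from the talk.

# Minimal sketch of fitting a power-law scaling curve, loss(C) = a * C**(-b).
# The data points below are purely illustrative placeholders, not real results.
import numpy as np

# Hypothetical (training compute in PF-days, validation loss) pairs.
compute = np.array([1.0, 10.0, 100.0, 1000.0])
loss = np.array([4.2, 3.5, 2.9, 2.4])

# A power law is linear in log-log space: log(loss) = log(a) - b * log(C).
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
a, b = np.exp(intercept), -slope

print(f"fitted loss(C) = {a:.2f} * C^(-{b:.3f})")
# Extrapolate to a 10x larger run (only meaningful if the power law keeps holding).
print(f"predicted loss at C=10000: {a * 10000 ** (-b):.2f}")

Fits like this are what make scaling attractive: if the power law continues to hold, the payoff of a larger coordinated training run can be estimated before committing the compute.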
