Simplified distributed training with tf.distribute parameter servers
Learn about a new tf.distribute strategy, ParameterServerStrategy, which enables asynchronous distributed training in TensorFlow, and how to use it with the Keras APIs and custom training loops. If you have models with large embeddings or an environment with preemptible machines, this approach lets you scale your training much more easily with minimal code changes.
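A minimal sketch of the Keras Model.fit path, assuming a cluster (chief, worker, and ps tasks) is already configured through the TF_CONFIG environment variable; the model architecture and dataset below are placeholders for illustration:

```python
import tensorflow as tf

# Resolve the cluster from TF_CONFIG (assumed to be set on each task).
cluster_resolver = tf.distribute.cluster_resolver.TFConfigClusterResolver()

strategy = tf.distribute.experimental.ParameterServerStrategy(
    cluster_resolver,
    # Shard large embedding variables across parameter servers.
    variable_partitioner=(
        tf.distribute.experimental.partitioners.MinSizePartitioner(
            min_shard_bytes=256 << 10, max_shards=2)))

with strategy.scope():
    # Variables (including the embedding table) are placed on parameter servers.
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=100_000, output_dim=64),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

# With Model.fit, each worker builds its own dataset via DatasetCreator.
def dataset_fn(input_context):
    features = tf.random.uniform((1024, 10), maxval=100_000, dtype=tf.int64)
    labels = tf.random.uniform((1024, 1), maxval=2, dtype=tf.int64)
    ds = tf.data.Dataset.from_tensor_slices((features, labels))
    return ds.shuffle(1024).repeat().batch(32)

model.fit(tf.keras.utils.experimental.DatasetCreator(dataset_fn),
          epochs=2, steps_per_epoch=100)
```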
Resources:
Distributed training with TensorFlow → https://goo.gle/3bli9n6
Parameter server training → https://goo.gle/2Zz3UrW
Speaker:
Yuefeng Zhou (Software Engineer)
Watch all Google’s Machine Learning Virtual Community Day sessions → https://goo.gle/mlcommunityday-all
Subscribe to the TensorFlow channel → https://goo.gle/TensorFlow
#MLCommunityDay