TensorFlow Serving performance optimization

Wei Wei, Developer Advocate at Google, shares general principles and best practices for improving TensorFlow Serving performance. He discusses how to reduce latency through the choice of API surface, batching, and other parameters that you can tune.

Resources:
TensorFlow Serving performance guide → https://goo.gle/3zW168E
Profile Inference Requests with TensorBoard → https://goo.gle/3zWjluJ
TensorFlow Serving batching configuration → https://goo.gle/3xT2SVz
TensorFlow Serving SavedModel Warmup → https://goo.gle/3ygfIhT
XLA homepage → https://goo.gle/3zY01gw
How to make TensorFlow models run faster on GPUs (with XLA) → https://goo.gle/3OAB8LR
How OpenX Trains and Serves for a Million Queries per Second in under 15 Milliseconds → https://goo.gle/3NdAOSd
ResNet complete example → https://goo.gle/3zU1PHs

Deploying Production ML Models with TensorFlow Serving playlist → https://goo.gle/tf-serving
Subscribe to TensorFlow → https://goo.gle/TensorFlow

#TensorFlow #MachineLearning #ML
