Optimize AWS Inferentia utilization with FastAPI and PyTorch models on Amazon EC2 Inf1 & Inf2 instances

AWS Machine Learning

If the model changes on the server side, the client has to know and change its API call to the new endpoint accordingly. In this post, we share best practices to deploy deep learning models with FastAPI on AWS Inferentia NeuronCores. We demonstrate the use of Neuron Top at the end of this blog.
