Optimize AWS Inferentia utilization with FastAPI and PyTorch models on Amazon EC2 Inf1 & Inf2 instances
AWS Machine Learning
JULY 24, 2023
If the model changes on the server side, the client has to know and change its API call to the new endpoint accordingly. In this post, we share best practices to deploy deep learning models with FastAPI on AWS Inferentia NeuronCores. We demonstrate the use of Neuron Top at the end of this post.
3 Inf1.24xlarge 16 64 1.5
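One way to picture deploying model servers on individual NeuronCores is to give each server process its own environment that exposes exactly one core. The sketch below is a minimal illustration under stated assumptions, not the post's own code: `NEURON_RT_VISIBLE_CORES` is the Neuron runtime variable that limits which cores a process can see, while the `worker_envs` helper, the `PORT` variable, and the base port of 8080 are hypothetical names chosen for this example.

```python
def worker_envs(num_cores: int, base_port: int = 8080) -> list[dict[str, str]]:
    """Build one environment mapping per model server, each pinned to one NeuronCore.

    Assumptions for illustration: one server process per core, and a
    hypothetical PORT variable that the server reads to pick its listen port.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    return [
        {
            # Restrict this process to a single NeuronCore by index.
            "NEURON_RT_VISIBLE_CORES": str(core),
            # Hypothetical: give every server its own port so requests can be routed.
            "PORT": str(base_port + core),
        }
        for core in range(num_cores)
    ]


if __name__ == "__main__":
    for env in worker_envs(4):
        print(env)
```

Each mapping could then be merged into `os.environ` and passed as the `env` argument when launching a server process, so that every FastAPI instance loads its model onto a different core.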