Blog - Customer Contact Central

Optimize AWS Inferentia utilization with FastAPI and PyTorch models on Amazon EC2 Inf1 & Inf2 instances

AWS Machine Learning

JULY 24, 2023

A client doesn’t need to know the name or version of the model deployed behind the server; the endpoint name is just a proxy to a function that loads and runs the model. By contrast, when the model’s name and version are part of the endpoint name, the client has to know whenever the model changes on the server side and change its API call to the new endpoint accordingly.

Table excerpt (column headers not included in the excerpt): 3 | Inf1.24xlarge | 16 | 64 | 1.5
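The endpoint-as-proxy idea in the excerpt can be illustrated with a minimal, framework-free sketch. Everything here is hypothetical: `ModelServer`, `load_model_v1`, `load_model_v2`, and the `/predict` endpoint name are illustrative stand-ins, not the article's code or any FastAPI or AWS Neuron API. The point is only that the client keeps calling the same endpoint while the server swaps the model behind it.

```python
from typing import Callable, Dict, List

Model = Callable[[List[float]], float]


# Hypothetical stand-ins for deployed models; a real server would load
# compiled PyTorch models here instead of these toy functions.
def load_model_v1() -> Model:
    return lambda xs: sum(xs)


def load_model_v2() -> Model:
    return lambda xs: sum(xs) / len(xs)


class ModelServer:
    """Maps a stable endpoint name to whichever model is currently deployed."""

    def __init__(self) -> None:
        self._endpoints: Dict[str, Model] = {}

    def deploy(self, endpoint: str, loader: Callable[[], Model]) -> None:
        # Load the model once at deploy time. The endpoint name carries
        # no model name or version, so redeploying does not rename it.
        self._endpoints[endpoint] = loader()

    def handle(self, endpoint: str, payload: List[float]) -> float:
        # The endpoint is just a proxy to the function that runs the model.
        return self._endpoints[endpoint](payload)


server = ModelServer()

# The client only ever knows "/predict".
server.deploy("/predict", load_model_v1)
before = server.handle("/predict", [1.0, 2.0, 3.0])

# The server swaps the model; the client call is unchanged.
server.deploy("/predict", load_model_v2)
after = server.handle("/predict", [1.0, 2.0, 3.0])
```

If the endpoint were instead named after the model and its version, the second `deploy` would create a new endpoint name and every client would have to update its call.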


Scripts APIs Engineering Best practices
