Maximize Stable Diffusion performance and lower inference costs with AWS Inferentia2
AWS Machine Learning
JULY 26, 2023
We compile the UNet for a batch size of one (by tracing it with input tensors that have a single batch dimension), then use the torch_neuronx.DataParallel API to load this single-batch model onto each of the two Neuron cores. If you have a custom inference script, you need to provide that instead.
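The pattern above can be sketched as follows. This is a minimal illustration, not the post's full pipeline: `TinyUNet` is a hypothetical stand-in for the Stable Diffusion UNet, and the tensor shapes and the `set_dynamic_batching=False` argument are assumptions based on common torch_neuronx usage. Running it requires an Inferentia2 instance with the Neuron SDK installed.

```python
import torch
import torch.nn as nn
import torch_neuronx

# Hypothetical stand-in for the Stable Diffusion UNet, used only to
# illustrate the tracing and DataParallel calls.
class TinyUNet(nn.Module):
    def forward(self, sample, timestep, encoder_hidden_states):
        return sample + encoder_hidden_states.mean()

model = TinyUNet().eval()

# Trace with batch-size-1 example inputs so the compiled graph
# expects exactly one batch per call. Shapes here are illustrative.
sample = torch.randn(1, 4, 64, 64)
timestep = torch.tensor(1.0)
hidden = torch.randn(1, 77, 768)
traced = torch_neuronx.trace(model, (sample, timestep, hidden))

# Replicate the single-batch model across two Neuron cores. Inputs
# with batch size 2 are split along dim 0, one batch per core.
unet_parallel = torch_neuronx.DataParallel(
    traced, device_ids=[0, 1], set_dynamic_batching=False
)

# A batch of 2 is now served by both cores in parallel.
out = unet_parallel(
    torch.randn(2, 4, 64, 64), timestep, torch.randn(2, 77, 768)
)
```

Disabling dynamic batching pins each core to the batch size the model was compiled with, which is why the UNet is traced with one batch before being replicated.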