Boost inference performance for Mixtral and Llama 2 models with new Amazon SageMaker containers

AWS Machine Learning

In January 2024, Amazon SageMaker launched a new version (0.26.0) of its containers. Before introducing this API, the KV cache was recomputed for any newly added requests. Be mindful that LLM token probabilities are generally overconfident without calibration.