Frugality meets Accuracy: Cost-efficient training of GPT NeoX and Pythia models with AWS Trainium
AWS Machine Learning
DECEMBER 12, 2023
NeoX 20B is trained on 4 nodes, on both GPU and Trainium, with the same training hyperparameters (global batch size = 256). A small wiki dataset is used for the fine-tuning demonstration.
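To make the setup concrete, the batch-size arithmetic behind the run can be sketched as follows. Only the values stated above (4 nodes, global batch size 256) come from the post; the device count per node, the micro-batch size, and the variable names are illustrative assumptions, and tensor/pipeline parallelism is ignored for simplicity.

```python
# Hypothetical sketch of the batch-size accounting for the run described
# above. Parameter names are illustrative, not the actual script's flags.

NODES = 4                  # stated in the post
DEVICES_PER_NODE = 32      # assumption: accelerator count per node
GLOBAL_BATCH_SIZE = 256    # stated in the post

# Under pure data parallelism, micro-batch size x gradient-accumulation
# steps x data-parallel degree must equal the global batch size.
data_parallel_degree = NODES * DEVICES_PER_NODE      # 128
micro_batch_size = 1                                 # assumption
grad_accum_steps = GLOBAL_BATCH_SIZE // (data_parallel_degree * micro_batch_size)

assert micro_batch_size * grad_accum_steps * data_parallel_degree == GLOBAL_BATCH_SIZE
print(grad_accum_steps)  # 2
```

Keeping this identity fixed is what lets the GPU and Trainium runs use the same effective hyperparameters even if the per-device micro-batch differs between platforms.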