article thumbnail

Amazon SageMaker built-in LightGBM now offers distributed training using Dask

AWS Machine Learning

In this post, we discuss how distributed training can be applied to tabular data (a common type of data found in many industries such as finance, healthcare, and retail) using Dask and the LightGBM algorithm for tasks such as regression and classification. For the NYC taxi data, we use the yellow trip taxi records from 2009–2022.