Time Zone: UTC

19 May 13:00 – 19 May 14:00 in Talks, Tutorials / Workshops 2, Tutorials / Workshops 1

Keynote: Clusters of Clusters: Using Dask Distributed to Scale Enterprise Machine Learning Systems

Grant Gelven

Audience level:
Intermediate

Description

The past decade has shown there is a steep learning curve for organizations trying to scale and productionalize ML systems quickly. At Walmart, we have developed several principles over the years that allow us to address this challenge. In this talk, I will discuss these principles and the open-source tools that enable us today.

Abstract

Nearly all enterprises today are looking to leverage data science and machine learning (ML) to transform, and sometimes even create, the marketplace. The past decade, however, has shown there is a steep learning curve for organizations trying to scale and productionalize ML systems quickly. Even in 2020, some 43% of companies cited scaling as their largest ML Operations issue*, and more than half of the same enterprises required more than 30 days to get a single ML model into production. At Walmart, we have developed several principles over the years that allow us to scale ML systems quickly. In this talk, I will discuss these principles and the open-source tools that enable us. I’ll highlight the use of Dask Distributed to leverage code developed on day one to scale up local model development, as well as how to quickly scale out ML training of both small and large models, in parallel, for a number of enterprise use cases. After the session, participants will be able to easily recycle the demo code and adapt it to their own specific applications.