This workshop covers the most common methods for deploying Dask today. Starting with an overview of the moving pieces within a Dask cluster (client, cluster, scheduler, workers), we will then talk through various platforms and the tools used to deploy onto them, along with their benefits, common challenges, and pitfalls.
We will start by covering how a Dask cluster is deployed at a low level: manually running a scheduler and some workers, then connecting a client. We will walk through how these pieces find each other and communicate, and how we can find out more information ourselves using the dashboard and utilities in Python.
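As a rough illustration of the low-level pieces described above, the sketch below starts a scheduler, two workers, and a client inside a single Python process using the `dask.distributed` Python API. This is only one way to do it and is not necessarily how the workshop proceeds; a manual deployment would more typically run the `dask scheduler` and `dask worker <scheduler-address>` commands in separate terminals or machines. The function names `inc` and `main`, the worker count, and the choice of random ports (`port=0`, `dashboard_address=":0"`) are assumptions made for the example.

```python
import asyncio

from dask.distributed import Client, Scheduler, Worker


def inc(x):
    """Trivial task used to check that the cluster executes work."""
    return x + 1


async def main():
    # Start a scheduler on a random free port (illustrative choice) so the
    # example does not collide with a scheduler already running on 8786.
    async with Scheduler(port=0, dashboard_address=":0") as scheduler:
        # Workers find the scheduler through its address, e.g. tcp://127.0.0.1:xxxxx
        async with Worker(scheduler.address, nthreads=1), Worker(
            scheduler.address, nthreads=1
        ):
            # The client connects to the same address to submit work.
            async with Client(scheduler.address, asynchronous=True) as client:
                await client.wait_for_workers(2)
                result = await client.submit(inc, 10)
                # The scheduler tracks every worker that has registered with it.
                n_workers = len(scheduler.workers)
                return result, n_workers


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The scheduler address is the only thing the workers and the client need in order to find each other, which is why every component in the sketch is handed `scheduler.address`.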
We will also cover, at a high level, the three types of deployment that folks tend to use: fixed, where a single Dask cluster is installed and run indefinitely; ephemeral, where clusters are created and destroyed as part of a single workflow; and multi-tenant, where clusters are created and managed centrally by Dask Gateway.
We will also have a series of talks giving an overview of platform-specific tooling. These talks will outline deploying Dask clusters on Kubernetes, HPC, Cloud, and with Dask Gateway, and dive into the strengths, challenges, and limitations of each platform.
By the end of the workshop, attendees will understand how Dask clusters are deployed and how their components communicate, and will have relevant examples of how they can deploy and scale Dask on their own infrastructure.
Times in UTC