Time Zone: UTC

19 May 01:30 – 19 May 02:00 in Tutorials / Workshops 2

Dask Down Under: Dask DevOps for Remote Sensing

Tisham Dhar

Audience level:
Novice

Description

This talk covers our DevOps journey with Dask at Geoscience Australia, the architecture of our processing cluster, as well as some of the very expensive lessons we learnt along the way.

Abstract

Processing long time series of satellite imagery at continental scale requires a lot of resources, especially RAM. Luckily, the patterns in which we use this data are decomposable. We use Dask's chunking mechanisms to break up image processing tasks across the nodes of an AWS cluster and crunch through petabytes of imagery covering all of Africa and Australia. This talk covers the DevOps journey to get to this stage and the architecture of our processing cluster, and shares some of the very expensive lessons we learnt from processing with slightly wrong configurations in the cloud environment. We will finish with the ongoing work to improve the developer experience of launching personal Dask clusters and attaching Dask to web services.
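As a rough illustration of the chunking idea mentioned above, the sketch below splits a small synthetic image stack into spatial chunks with `dask.array` and computes a per-pixel mean over time. The array shape, chunk sizes, and variable names are assumptions made for illustration only; this is not the pipeline or configuration used at Geoscience Australia, and it runs on the default local scheduler rather than a cluster.

```python
import dask.array as da

# Hypothetical stack: 12 time steps of a 1000 x 1000 pixel tile (synthetic data).
# Each chunk keeps the full time axis but only a 250 x 250 pixel window,
# so a per-pixel time-series reduction never needs data from another chunk.
stack = da.ones((12, 1000, 1000), chunks=(12, 250, 250), dtype="float32")

# Lazy per-pixel mean over time; Dask builds one independent task per spatial chunk.
mean_image = stack.mean(axis=0)

# Trigger the computation. With a distributed client attached, the same
# graph would be spread across worker nodes instead of local threads.
result = mean_image.compute()

print(stack.numblocks)
print(result.shape)
```

Keeping the time axis whole within each chunk is one plausible choice for time-series reductions, because it bounds the memory per task by the chunk size rather than by the full image extent.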