This talk covers our DevOps journey with Dask at Geoscience Australia, the architecture of our processing cluster, and some of the very expensive lessons we learnt along the way.
Processing long time series of satellite imagery at continental scale requires substantial resources, especially RAM. Luckily, the patterns in which we use this data are decomposable. We use Dask's chunking mechanisms to break image-processing tasks up across nodes in an AWS cluster and crunch through petabytes of imagery over all of Africa and Australia. This talk covers the DevOps journey to get to this stage and the architecture of our processing cluster, and shares some of the very expensive lessons we learnt from running with slightly wrong configurations in the cloud. We will finish with ongoing work to improve the developer experience of launching personal Dask clusters and attaching Dask to web services, as sketched below.
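To give a flavour of the chunking idea, here is a minimal, illustrative sketch (not our production pipeline, which loads imagery from the Open Data Cube rather than generating random data): a (time, y, x) stack is split into chunks so a per-pixel temporal statistic decomposes across workers, and no single worker needs the whole time series in RAM.

```python
import dask.array as da
from dask.distributed import Client

# Local cluster for illustration; in production this points at the distributed scheduler.
client = Client()

# Stand-in for a (time, y, x) stack of satellite imagery; chunk sizes are illustrative.
stack = da.random.random((40, 4_000, 4_000), chunks=(10, 1_000, 1_000))

# A per-pixel temporal reduction decomposes over the spatial chunks.
composite = stack.mean(axis=0)

# Nothing runs until compute(); the scheduler fans the chunked task graph out over workers.
result = composite.compute()
client.close()
```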
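For the developer-experience work on personal clusters, one possible shape (assuming a Dask Gateway deployment; the address and our actual setup are placeholders here) looks roughly like this:

```python
from dask_gateway import Gateway

# Placeholder address; the real endpoint depends on the deployment.
gateway = Gateway("https://dask-gateway.example.com")

# One ephemeral, personal cluster per developer.
cluster = gateway.new_cluster()
cluster.scale(4)  # request four workers

# Any Dask collection used from this client runs on the remote cluster.
client = cluster.get_client()

# ... submit work, e.g. the chunked computation sketched above ...

client.close()
cluster.shutdown()
```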