Scaling Pandas using Dask: How to avoid all my mistakes

Dask is a Python package that provides advanced parallelism for analytics, enabling performance at scale for the tools you love. People think it’s magic: drop it in and it scales. Dropping it in will mostly work, but it will not scale well!

We would like to share what we’ve learned about using Dask to scale dataframes and computations, so that you can avoid making the same mistakes.

Categories: Day 2