Dask DataFrame groupby. Why it can fail and how to compensate.

Groupby operations on Dask DataFrames are common and powerful. However, because a Dask DataFrame is split into partitions that may be spread across many workers, a groupby can fail in unexpected ways. This talk covers strategies for mitigating these problems, including using set_index to optimize the data layout and tuning the split_out and split_every parameters to optimize the computation.

Categories: Day 3