Memory spilling is an important feature that makes it possible to run Dask applications that would otherwise run out of memory. When low on memory, Dask automatically moves data from GPU memory to main memory and/or from main memory to disk. Dask supports spilling to disk out of the box, and external projects such as Dask-CUDA provide spilling of GPU memory to main memory. In this talk, we will walk through how spilling works in general, discuss its shortcomings, and introduce a new Dask-CUDA approach that overcomes these shortcomings and provides highly efficient, accurate spilling. The talk will focus on GPU memory spilling, but the approach is general enough to be useful for main memory spilling as well.
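As a rough illustration (not taken from the talk itself), the sketch below shows how spilling thresholds are typically configured with Dask-CUDA. The limits are placeholder values, and it assumes the newer Dask-CUDA approach is enabled via LocalCUDACluster's jit_unspill option.

    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster

    # Limits below are illustrative placeholders, not recommendations.
    cluster = LocalCUDACluster(
        device_memory_limit="10GB",  # spill GPU memory to main memory above this threshold
        memory_limit="32GB",         # spill main memory to disk above this threshold
        jit_unspill=True,            # opt into the newer proxy-based spilling (assumed option)
    )
    client = Client(cluster)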
This talk is targeted at folks who use spilling and want to avoid memory usage spikes and reduce spilling overhead. It may also interest people who want to learn about the more technical side of task execution in Dask.