Time Zone: UTC

19 May 15:00 – 19 May 15:30 in Talks

Transforming Terabytes of healthcare data with Dask and Kubeflow pipeline

Michael Sebbah, Anthony Dubois, KRIEF David, Yoann Janvier

Audience level:
Intermediate

Description

As part of the data science team at Ipsen, we study the effects of medical products on patients. Computing on terabytes of data is a challenge, and we will explain why we chose Dask together with Kubeflow to tackle it. We will also share the issues we faced and how we fixed them. Come spend some time with us and delve into a real Dask use case.

Abstract

Ipsen is a leading biopharmaceutical group dedicated to prolonging and improving lives and health outcomes through innovative medicines in oncology, neuroscience and rare disease. As part of the data science team, we study the effects of medical products on patients by exploring healthcare databases to gain insights. We faced the challenge of processing terabytes of patient data, and we are happy to share this experience with you.

Come spend some time with us and see how data engineers and data scientists work together to innovate for patients and society.