In this episode, Associate Professor Sarah Kummerfeld, Head of Data Science for the Garvan Institute of Medical Research, joins Jay and Theo to discuss the vital role cloud computing plays in genomic sequencing. Genomic sequencing has been contributing to medical research to improve the understanding and diagnosis of rare diseases for many years. However, sequencing a single genome produces approximately 100GB of data in its raw format, which then needs to be converted into a format that can be analyzed and shared with researchers. This process can take as much as 600 CPU hours per genome. The Garvan Institute knew that processing vast amounts of genomes was far beyond the capabilities of on-prem infrastructure. And during the COVID-19 pandemic, genomic sequencing took on even more importance. Sarah’s team realized that moving to the public cloud was the only way, embarking on a pilot program to process 14,000 genomes. Listen in as Sarah reveals how the team at Garvan Institute reduced the time it took to sequence the virus from a PCR test from two days to about four hours – dramatically speeding up contact tracing and reducing the spread of COVID-19 in the community. She talks through how the team uses a system called Terra, and how it became easier to manage the privacy and security of genomic data in Google Cloud, and the vast capacity required for this data in the region. In fact, Sarah’s team discovered that with Google Cloud, there was enough capacity available within Australia to run a pilot program three times as big. As biology increasingly becomes a data science, generating enormous pools of data, Sarah shares how the Garvan Institute is embracing the huge opportunity machine learning presents to help build and improve the vital genomic infrastructure for Australia.
- Series: Episodes