Challenge
The University of York are committed to enhancing their position as one of the world’s premier institutions for inspirational and life-changing research. Having previously relied on high-performance computing (HPC) clusters, one of the University’s research teams wanted to explore the possibilities of the cloud.
Professor James Chong, Royal Society Industry Fellow at the University of York, is a biologist who studies the dynamics of anaerobic microbial communities. His aim is to understand how to make sewage sludge processing and waste treatment more efficient by recovering resources and reducing emissions of environmentally harmful greenhouse gases.
To do this, Prof Chong and his group collect gigabases of DNA sequence from mixed microbial communities. They then work with colleagues Dr John Davey (Bioinformatician) and Dr Peter Ashton, Head of the Genomics and Bioinformatics Laboratory, both in the University’s Bioscience Technology Facility, to analyse the data on their HPC clusters.
Ashton and his lab use nanopore sequencing to produce “long reads” containing hundreds of thousands of DNA base pairs, and Davey assembles these reads by comparing overlapping sections to piece them together.
The researchers perform “metagenome assembly”, which requires a large amount of RAM and disk space. One 60-gigabase dataset had failed to assemble on a small local cluster. After several unsuccessful attempts, the team spoke to Google, who referred them to CTS. CTS then helped them to pilot their workflow on Google Compute Engine virtual machines (VMs). Through their partnership with Google Cloud and GÉANT, CTS are able to offer unique services to the European Research and Education community.