University of Queensland scientists playing a small but crucial role in a global research collaboration to improve the qualities of sweet potatoes are using the high-performance computer FlashLite to power their work.
The Bill & Melinda Gates Foundation-funded project, largely involving U.S., African, South American and Australian researchers, aims to improve crop growth and the genetic makeup of the sweet potato to help people in sub-Saharan Africa, one of the world’s poorest regions.
For the project, UQ's Institute for Molecular Bioscience (IMB) is developing a digital platform of genomic, genetic and bioinformatics software tools to help researchers worldwide sequence the sweet potato genome more efficiently.
The sweet potato genome is large and highly complex, with six copies of each chromosome, whereas humans have two. Adding to the complexity, genome sequencing data needs to be combined from multiple sources and technologies, resulting in a huge volume of data.
IMB, in its project titled ‘Genomic Tools for Sweet Potato Improvement (GT4SP)’, is integrating the different data sets to assemble the sweet potato genome. “It’s the most computationally intensive part of the project,” said Dr Lachlan Coin, GT4SP project lead and an IMB Principal Research Fellow. “As the genome is large, polyploid and highly heterozygous, terabytes of computer disk space and memory is required to maintain the data and run the assembly tools.”
Dr Coin and his team are using RCC-designed FlashLite to crunch the numbers. They are taking advantage of the HPC’s large-memory nodes, abundant CPUs, fast I/O speed, and vast disk storage to assemble and analyse the genome. The group’s allocation of FlashLite resources includes a large disk quota of 10 TB and big-memory nodes of up to 4 TB.
“We couldn’t do this analysis without a big memory, large storage and fast I/O speed. This project wouldn’t have been possible without FlashLite,” said Dr Coin.
The IMB team is also using a few other eResearch resources for data storage (managed through RCC), such as QRIScloud, a cloud computing service operated by RCC collaborator QCIF, and the Polaris Data Centre in Springfield, Queensland.
Dr Coin and his team are aiming to complete the GT4SP project in the second half of next year. “Assembly of such a complex genome would be a world first, and nobody has completed a de novo genome assembly from scratch before,” said Dr Coin.
Once the sweet potato genome is assembled, it will become a significant global resource for association studies to identify critical genes for important traits. “Not only will researchers and breeders be able to grow better sweet potatoes, but the work will also open up a whole world of other studies on complex plant genomes,” explained Dr Coin.
To check if FlashLite is suitable for your research, please contact RCC Support: rcc-support@uq.edu.au.