TERN and RCC develop portal for reproducing scientific workflows and results

14 Jul 2016
Via CoESRA: The IUCN Red List of Ecosystems Risk assessment results and workflows for Victorian mountain ash forest.

Reproducibility of research results has long been a hot topic amongst scientists. As science becomes more data- and computationally intensive, reproducing others’ research has become harder. It requires access not only to the data, but also to the same software, operating systems and other tools.

A UQ-based team led by the Terrestrial Ecosystem Research Network (TERN) has taken a step towards addressing this issue by developing infrastructure for reproducible science: a virtual desktop, accessed through a web browser, called CoESRA (Collaborative Environment for Ecosystem Science Research and Analysis).

CoESRA, launched in July 2015, provides tools in the cloud and is equipped with the Kepler scientific workflow system and Nimrod software (other software can be added by users).

TERN developed CoESRA in collaboration with RCC, with funding from TERN and the Australian National Data Service through the Australian Government’s National Collaborative Research Infrastructure Strategy (NCRIS) program. QRIScloud, the Queensland node of Nectar and RDS administered by QCIF, provides CoESRA’s compute and storage infrastructure.

CoESRA aims to make ecosystem science research reproducible in a form others can repeat with minimal effort. It provides easy access to the execution environment and the resources needed to build, execute and share repeatable workflow-based scientific experiments, all without having to download any software or go through a merit-based allocation process. Users can share infrastructure, data and analysis via the cloud, which drastically reduces, or removes, the setup costs for others to rerun an experiment.

Any Kepler- or Nimrod-based experiment can be hosted and published on CoESRA, and users can register to access the platform and execute the fully configured workflow.

“The Kepler workflow system was used due to its support for access to local and remote data, its large pool of reusable components and because it was initially developed for the discipline of ecology,” said project lead Dr Siddeswara Guru, TERN’s Data Integration and Synthesis Manager.

“We have taken a holistic approach to reproducibility, where the complete computational analysis processes are available as an executable, irrespective of the complexity of the scientific research. This will make reproducibility a non-tedious task for other researchers, and where appropriate, assist the original researchers in the longer-term to repeat the analysis processes at different time intervals. These workflows will dramatically lower future barriers to sharing and repeating similar complex analyses and synthesis research.

“In future, we envisage that the reproducible experiments will be made available during the review process of the scientific paper. This will enable reviewers to check the experiment while reviewing the paper with minimal effort,” said Dr Guru.

Currently, an IUCN (International Union for Conservation of Nature) Red List of Ecosystems Risk assessment workflow for Victorian mountain ash forest is available in a completely reproducible form.

CoESRA is best accessed via a Google Chrome web browser. Those interested can register for an account using their Australian Access Federation credentials (i.e. their Australian institutional login) at https://coesra.org.au. Non-AAF users can request a guest account by emailing coesra.tern@gmail.com. The ‘how-to’ guide for the system is available at: https://www.coesra.org.au/#/faq.

Each CoESRA session is limited to 48 hours to ensure efficient use of the cloud infrastructure. Users can ask to extend the session if they need to have continuously running jobs on the machine.

RCC’s initial work on the project saw Systems Programmer Hoang Nguyen develop CoESRA’s core services while Research Fellow Dr Minh Dinh developed one of the first two workflows as illustrative use cases. “We are continuing this partnership to develop the platform further,” said Mr Nguyen. “In particular, RCC is developing several more workflows in CoESRA and working towards allowing users to execute workflows and interact with them from the Web, in addition to the current desktop interface.”

RCC and TERN are also working towards enabling easier overseas access and extending CoESRA’s use to a wider range of users.

Please contact Dr Guru for further information about CoESRA: s.guru@uq.edu.au.
