RCC's new framework to simplify digital instrument-based research

5 Sep 2019

RCC has developed a new framework to make it easier for researchers who use digital instruments to manage and analyse their data.

The new framework, CAMERA, will facilitate the CApture, ManagEment, stoRage and Analysis of data from microscopes, scanners and sequencers.

CAMERA supports the complete life cycle of instrument-gathered data, making it seamlessly available across a range of instruments, cloud computers, desktops and high-performance computers (HPCs).

RCC Director Prof. David Abramson said that while it is possible to move data manually between instruments, storage systems and computers, “it is preferable to make access as transparent as possible to simplify the job of the researchers.

“While digital instruments have been common for some years, the exponential growth in data volume and velocity is challenging for computational infrastructure, and this demands innovative and powerful solutions be developed,” he said.

The RCC team has implemented an efficient solution for instrument pipelines while keeping the system simple for researchers. CAMERA achieves this by leveraging powerful underlying technologies, such as high-throughput networks and storage systems, while hiding this complexity from users.

CAMERA appears as a single platform, but its backend is actually an aggregate of best-of-breed technologies, such as data repositories and metadata management systems.

While CAMERA itself is new, it leverages existing open-source and commercial software and infrastructure extensively.

CAMERA achieves seamless inter-operation with HPCs without the need to copy files in and out of a repository.

It also supports a number of managed data repositories, such as OMERO and MyTardis, and the team is looking at integration with XNAT and IMS. These repositories are the primary view of the data from the instrument and the operator, and facilitate metadata management, search, sharing, and simple visualisation and processing tasks.

At UQ specifically, CAMERA builds on and extends the UQ Research Data Management platform (UQRDM), which simplifies the task of requesting data storage; the Metropolitan Data Caching Infrastructure (MeDiCI) data fabric, which simplifies the process of accessing data; and a variety of image repository stacks.

While the repository stack is executed on a QRIScloud (secure cloud computer) node, data is actually stored on MeDiCI. MeDiCI stores a single copy of the data, avoiding unnecessary replication, but provides multiple access protocols and views.

MeDiCI makes it simple to expose a data collection to a repository and later mount that same data on an HPC cluster. While data movement might be necessary, it is organised transparently.

Importantly, CAMERA leverages UQRDM, through which researchers request a data collection prior to their laboratory work and link this collection to the repository. This means that the project-level metadata is captured once in UQRDM, and the University can manage the provenance of the data in the same way as all other research data. MeDiCI then facilitates transparent access to the data throughout the analysis pipeline, and also provides mechanisms for sharing data with collaborators.

Prof. Abramson will present a talk about CAMERA at this year’s eResearch Australasia conference, on Tuesday, 22 October, 5:10pm–5:30pm, at the Brisbane Convention and Exhibition Centre.

Please contact the RCC Support Desk if you have any queries about CAMERA or would like to trial its use: rcc-support@uq.edu.au.
