RCC staff report on SC25

18 Dec 2025

Three Research Computing Centre staff travelled to St Louis, USA, last month to attend the world's largest international high-performance computing conference, Supercomputing 2025 (SC25). 

RCC's Jake Carroll, Ashley Wright and Oliver Cairncross attended SC25 (also known as the International Conference for High Performance Computing, Networking, Storage, and Analysis) from 16-21 November 2025. 

Read each of their reports about SC25 below.

The SC event brings together leaders, innovators, professionals, academics and students from around the world to explore the latest advancements in high-performance computing.

The SC program is designed to share best practices in areas of HPC such as: algorithms; applications; architectures and networks; clouds and distributed computing; data analytics, visualisation, and storage; machine learning and HPC; performance; programming systems; system software; and state of the practice in large-scale deployment and integration.

Ashley Wright, Jake Carroll and Oliver Cairncross outside the SC25 venue.

SC25: AI and data now dominate HPC

By Jake Carroll, RCC Director

Many of the trends showcased at this year’s Supercomputing Conference (SC25) align very closely with the direction of research computing at The University of Queensland.

SC25 was held in St Louis, USA, bringing together more than 16,000 researchers, technologists, research computing professionals, vendors, and policymakers from around the world.

RCC attended the annual event to stay informed about global developments in high-performance computing (HPC) and digital research infrastructure.

The three RCC staff who attended, including myself, met with UQ’s key technology partners in high-performance computing, research data storage and software innovation.

Rather than just a “supercomputing show”, SC25 has evolved into a key global forum for research infrastructure directions, future computing capability, scientific workflows, data, and AI.

Main takeaway: HPC + AI + data

For much of the last decade, HPC was discussed mostly in terms of supercomputers used for traditional simulation, such as in climate science or physics. SC25 made it clear that modern research computing is now a merging of fields: rather than being separate disciplines, AI, simulation, data analytics, and real-time decision systems are now one converged capability.

Many keynote speakers and exhibitors described national and institutional computing strategies as “AI-first”, meaning infrastructure is increasingly designed to support large-scale machine learning, scientific AI, data workflows, and simulation in combination.

The global exascale moment

The world now has multiple “exascale-class” systems (machines capable of more than one billion-billion calculations per second), with new systems across the US and Europe announced at SC25.

While Australia is unlikely to build an exascale system in the near term, the technologies being developed for these systems (accelerators, high-speed networking, new software models, liquid cooling) will ultimately shape what universities will buy in the next 3-8 years.

Internationally, researchers increasingly collaborate via remote access to overseas platforms. The European exascale machine “JUPITER” went live this year, opening new opportunities for collaboration, especially in research domains where UQ is very active.

Most important trend: performance is no longer just about speed

Traditionally, the supercomputing world focused on “FLOPS”—how many mathematical operations per second a machine could perform. SC25 showed a fundamental shift away from this. Success is increasingly measured by:

  • time-to-science
  • real-time capability
  • energy use
  • data throughput
  • workflow automation
  • GPU acceleration
  • ability to support diverse disciplines
  • and scientific impact.

This is extremely relevant to UQ, because researchers here increasingly care less about “world-record computing power” and more about:

  • GPU access
  • large storage
  • faster turnaround
  • easy-to-use software
  • workflow automation
  • AI-ready platforms.

The growth strategy of UQ supercomputer Bunya aligns strongly with this shift.

AI for scientific acceleration

The most impressive scientific result showcased at SC25 was a real-time tsunami prediction system, capable of producing a full forecast in under 0.2 seconds. Instead of simply “running a huge simulation”, the project combined:

  • AI
  • Bayesian inference
  • acceleration hardware
  • and scientific modelling.

This illustrates that the future of scientific computing is likely not “bigger models” but rather smarter workflows with specialised models combining HPC + AI + research domain expertise.

Infrastructure and cooling: a practical shift

A surprisingly visible theme this year was infrastructure rather than computing hardware. Many exhibitors were demonstrating:

  • liquid cooling technology
  • power distribution systems
  • facility racks
  • datacentre energy solutions
  • and water-based cooling systems.

The clear message is that research computing is now limited more by power and cooling than by silicon and chip design.

Summary

SC25 confirmed a major global shift. HPC is no longer just about traditional simulation. The future of research computing is becoming:

  • AI-accelerated
  • data-centric
  • workflow-focused
  • energy-aware
  • and impact-driven.

For UQ, this validates several strategic directions already underway through the University’s Research Infrastructure portfolio (of which RCC is a part), particularly in AI, GPU computing, data infrastructure, research software engineering support, and the Bunya platform roadmap.

Rather than chasing the world’s fastest computer, UQ is investing where global science is going—integrated computing, data, and AI capability supporting real scientific outcomes.

 

AI is transforming supercomputing 

By Ashley Wright, RCC Senior Manager—Digital Research Infrastructure 

One of the standout features of the Supercomputing 2025 conference was the workshops, where numerous researchers showcased the integration of artificial intelligence (AI) tools within traditional supercomputing workflows.

These sessions not only showcased innovative applications but also provided practical examples of AI transforming high-performance computing.  

The workshops offered a unique perspective on the evolving synergy between AI and supercomputing. This was the standout concept for me, and it was reflected throughout the conference. 

The conference floor was filled with activity, packed with vendors focused on data centre infrastructure. Cutting-edge cooling technologies, systems for delivering vast amounts of electricity, massive pipes for handling immense water flow, and extensive networking solutions were all on display.

Every aspect centred on the challenge of channelling enormous amounts of power, heat, and water into a remarkably compact space. The sheer scale and innovation highlighted the critical role that infrastructure plays in supporting next-generation supercomputing.

The vendor talks also had an immense amount of information to share, much of which is covered by non-disclosure agreements (NDAs), so I won’t discuss it here. However, you might start to see a lot of this software and hardware appear on UQ’s tech platforms as we integrate our learnings over the next 12 months. 

Attending SC25 was an exciting and inspiring experience, and I plan to use the knowledge gained to evolve RCC’s research infrastructure plans in the future. 

 

“The sheer size and scale of AI can’t be denied” 

By Oliver Cairncross, RCC Research Analyst

Throughout the Supercomputing 2025 (SC25) conference, I attended machine learning (ML) and artificial intelligence (AI) streams.

It is very apparent that AI is dominating the industry and is having a tremendous impact on the shape of high-performance computing regardless of sub-discipline.  

The ‘old guard’ representing the ‘traditional’ computational camp had to be content with throwing the occasional light jab at the AI camp. The sheer size and scale of AI can’t be denied.

Here is some of what I observed at SC25: 

AI is having a positive impact on computational fields. Surrogate ML models are trained with output from computationally expensive numerical models. These surrogates can then be used to approximate the behaviour of the numerical models at a much lower cost. They don’t necessarily replace the numerical models but complement them and serve as tools to help improve them.  
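
As an illustration of this pattern (my own minimal sketch, not code from any SC25 presentation), the Python below stands in a cheap analytic function for an expensive numerical model, fits a Gaussian-process surrogate to a limited set of its runs, and then queries the surrogate where the full model would be too slow. The stand-in model, parameter ranges and scikit-learn choices are all assumptions for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Hypothetical stand-in for a computationally expensive numerical model
# (e.g. a PDE solver). Here it is a cheap analytic function so the sketch
# runs end to end.
def expensive_numerical_model(x: np.ndarray) -> np.ndarray:
    return np.sin(3.0 * x[:, 0]) * np.exp(-0.5 * x[:, 1] ** 2)

rng = np.random.default_rng(seed=0)

# 1. Run the expensive model a limited number of times to build training data.
X_train = rng.uniform(-2.0, 2.0, size=(200, 2))
y_train = expensive_numerical_model(X_train)

# 2. Fit a cheap surrogate (here a Gaussian process) to those input/output pairs.
surrogate = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
surrogate.fit(X_train, y_train)

# 3. Query the surrogate wherever the full model would be too slow to run,
#    keeping its uncertainty estimate as a guide to where more full-model
#    runs (or corrections) are needed.
X_new = rng.uniform(-2.0, 2.0, size=(5, 2))
y_pred, y_std = surrogate.predict(X_new, return_std=True)
for inputs, mean, std in zip(X_new, y_pred, y_std):
    print(f"inputs={inputs}, surrogate prediction={mean:.3f} +/- {std:.3f}")
```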

Digital twins are computational replicas of physical systems (for example, a particle collider). Live sensor data is sent to the twin, allowing it to maintain a state synchronised with the physical asset. This can be used for monitoring and incorporated into control loops.  

Digital twins have leveraged ML in cases where a purely physics-based twin is too slow to operate in real time.
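
As a toy sketch of that synchronisation loop (again my own illustrative example, not any system shown at SC25), the Python below has a “twin” step a cheap model of a physical asset forward, then blend each prediction with a noisy live sensor reading so the twin’s state stays in step with the asset and can feed monitoring or control. The dynamics, noise levels and blending gain are assumed values.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical physical asset: a quantity (e.g. a temperature) that drifts over time.
true_state = 20.0

# Digital-twin state and a cheap predictive model of the asset's dynamics.
twin_state = 20.0
def twin_model_step(state: float) -> float:
    # Simplified physics stand-in: slow relaxation toward a set point of 25.0.
    return state + 0.1 * (25.0 - state)

GAIN = 0.5  # how strongly a sensor reading corrects the twin (an assumed tuning value)

for step in range(10):
    # Physical asset evolves with some unmodelled disturbance.
    true_state = true_state + 0.1 * (25.0 - true_state) + rng.normal(0.0, 0.2)

    # 1. Twin predicts forward using its (fast) model.
    predicted = twin_model_step(twin_state)

    # 2. A live (noisy) sensor reading arrives from the asset.
    sensor = true_state + rng.normal(0.0, 0.3)

    # 3. Twin blends prediction and measurement to stay synchronised.
    twin_state = predicted + GAIN * (sensor - predicted)

    # 4. The synchronised state can feed monitoring, alerting or a control loop.
    print(f"step {step}: asset={true_state:.2f}  twin={twin_state:.2f}")
```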


There were many questions regarding the impact of AI—namely code-generating LLMs—on teaching the next generation of computer scientists. Questions were raised at almost every panel session I attended. 

The answers ranged from making assessment solely exam-based to assessing the ability to use LLMs to solve problems (but making the problems extremely difficult).  

The issue of how LLMs would impact the development of critical thinking was often raised but without a good answer. I don’t think SC25 was the correct venue to tackle this very important issue. Nonetheless, it was frequently discussed.


There were many interesting and diverse uses of LLMs in general research. It would be impossible to track everything occurring in this space.  

I learned that an unlikely approach can sometimes succeed, even when my first reaction was “there’s no way that will work!”

Failed approaches were also discussed. The failure modes when LLMs are involved can be random and chaotic. It’s important to keep this in mind; newly developed workflows incorporating this technology must be monitored carefully and treated with caution.
 



Research undertaken on extreme-scale systems shows that AI infrastructure (GPUs and software stacks) is not as stable as we would expect compared with more traditional HPC infrastructure. Work must be done to build resiliency and reliability at both the lower layer (hardware) and the higher layer (software frameworks and tools).

I attended some sessions on network communications and learned that GPU protocols are developed with a strong emphasis on mitigating crashes during operation.

In this arena, the priority is to deploy new hardware and software as quickly as possible. If there are problems, they will be dealt with in production.

Rebooting hardware or rolling back software is proving more commonplace at extreme scale. As RCC grows its own AI- and GPU-focused Digital Research Infrastructure, we must be mindful of the compromises we are seeing in large-scale deployments so we can actively mitigate, improve and stabilise, enabling a consistent level of service for our researchers.

Oliver Cairncross on the show floor at SC25. (Photo by Jake Carroll.)
