A recent upgrade to The University of Queensland’s supercomputer will give researchers greater flexibility and faster results, especially for AI workloads.
The environment of UQ HPC Bunya has been modernised with the installation of Rocky Linux 9 (Rocky 9), a premier, stable and free enterprise-grade operating system (OS) for HPCs that is heavily used in modern cluster deployments.
RCC Senior Principal Scientific Frameworks Officer Oliver Cairncross said that in recent production tests following the installation of Rocky 9, RCC’s team achieved a sustained throughput of approximately 4,500 tokens* per second using a single high-end GPU on Bunya.
“This allowed us to process complex structured classification tasks across 7,500 documents in under 45 minutes. This type of workload was previously inefficient or technically impossible to run at this scale,” said Oliver.
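As a rough back-of-envelope check on those figures, the sketch below multiplies the quoted rate by the quoted time limit. It assumes the rate was sustained for the full 45 minutes, so the result is an upper bound rather than a measured value, and the function name is illustrative only.

```python
def token_budget(tokens_per_second, minutes):
    """Upper bound on tokens processed at a sustained rate (assumed constant)."""
    return tokens_per_second * minutes * 60


total = token_budget(4_500, 45)   # rate and time limit quoted in the article
per_document = total / 7_500      # average budget per document (upper bound)
print(f"{total:,.0f} tokens in total, about {per_document:,.0f} per document")
```

On those assumptions, the job could have handled at most about 12 million tokens, or roughly 1,600 tokens per document.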
RCC Director Jake Carroll said HPC modernisation is a perpetual task. “It is like painting the Sydney Harbour Bridge in many ways. We complete one pass and then immediately go back to the start and do it all over again.
“Modern kernels, libraries, and drivers are needed to derive as much performance and thus research outputs from our infrastructure as possible,” said Jake.
“This is also about efficiency and sustainability. More efficient execution of workloads means clearing the queues faster.
“Modern kernels can mean lower power consumption per task due to more efficient use of the hardware in our modern GPU and CPU platforms.
“The smoothness and careful execution of this major upgrade is testament to the planning and diligence our team has put into this initiative – I'm incredibly proud of them,” said Jake.
By moving to a more modern OS, RCC can now natively support the latest high-performance AI software stacks across Bunya’s AMD and NVIDIA GPU nodes.
Major HPC centres globally are migrating to Rocky 9 to take advantage of updated software stacks.
“Rocky 9 will resolve software versioning limitations that necessitated sub-optimal and fragile workarounds,” said Oliver.
Jake said: “AI and HPC workloads have long been one and the same. AI is a HPC problem to solve.”
Rocky 9 brings newer kernels (5.14+) and updated networking stacks that improve Remote Direct Memory Access (RDMA) and TCP performance, which directly affects the performance of collective operations in NCCL and RCCL, the GPU communication libraries from NVIDIA and AMD. In practice, this means better scaling efficiency across nodes and fewer communication bottlenecks in distributed training jobs.
Rocky 9 also provides better CPU-side performance for data pipelines. Newer system libraries enable better vectorisation and threading, which improves input pipeline throughput and helps keep both CPUs and GPUs fed, reducing idle time and I/O wait.
Rocky 9 defaults to more modern I/O schedulers (mq-deadline/none) for NVMe and offers better kernel handling of parallel I/O, both of which improve data streaming performance.
Rocky 9 has a 10-year support lifecycle (until May 2032), providing the long-term stability that is essential for research workloads.
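To illustrate why network performance matters for collective operations, here is a minimal sketch of the textbook cost model for a ring all-reduce. It is not how NCCL or RCCL are configured on Bunya (both libraries can choose among several algorithms), and the message size, GPU count, bandwidth and latency values are made-up assumptions for illustration.

```python
def ring_allreduce_seconds(message_bytes, n_gpus, bandwidth_bytes_per_s, latency_s):
    """Textbook ring all-reduce estimate.

    The ring takes 2 * (n - 1) steps, and each step sends 1/n of the
    message to a neighbour, paying one link latency per step.
    """
    if n_gpus < 2:
        return 0.0  # a single participant has nothing to exchange
    steps = 2 * (n_gpus - 1)
    chunk_bytes = message_bytes / n_gpus
    return steps * (latency_s + chunk_bytes / bandwidth_bytes_per_s)


# Hypothetical numbers: 1 GB of gradients shared across 8 GPUs.
slower = ring_allreduce_seconds(1e9, 8, bandwidth_bytes_per_s=12.5e9, latency_s=5e-6)
faster = ring_allreduce_seconds(1e9, 8, bandwidth_bytes_per_s=25e9, latency_s=2e-6)
print(f"slower link: {slower * 1e3:.1f} ms, faster link: {faster * 1e3:.1f} ms")
```

Because this cost is paid on every training step, even modest gains in bandwidth or latency add up over a long distributed job.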
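On Linux, the scheduler in use for each block device can be read from sysfs, where the active choice is shown in square brackets. The sketch below is one way to list them; the helper names are hypothetical, and which schedulers appear alongside the active one depends on the kernel and loaded modules.

```python
import re
from pathlib import Path
from typing import Dict, Optional


def active_scheduler(sysfs_text: str) -> Optional[str]:
    """Return the bracketed (active) scheduler, or None if none is bracketed."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    return match.group(1) if match else None


def list_schedulers(sys_block: Path = Path("/sys/block")) -> Dict[str, Optional[str]]:
    """Map each block device name to its active scheduler, skipping unreadable files."""
    result = {}
    for sched_file in sorted(sys_block.glob("*/queue/scheduler")):
        try:
            # Path layout is <device>/queue/scheduler, so go up two levels for the name.
            result[sched_file.parent.parent.name] = active_scheduler(sched_file.read_text())
        except OSError:
            continue
    return result


if __name__ == "__main__":
    for device, scheduler in list_schedulers().items():
        print(f"{device}: {scheduler}")
```

For a file containing, say, `[none] mq-deadline kyber bfq`, `active_scheduler` returns `none`.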
*In modern AI-driven HPC, a “token” represents a unit of data (roughly four characters or 0.75 words) processed by a large language model (LLM).
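The rule of thumb in this footnote can be sketched in a few lines. This is only an approximation based on the ratios quoted above; real tokenisers vary by model and by language, and the function name is illustrative.

```python
def estimate_tokens(text):
    """Estimate a token count two ways, using the rough ratios quoted above."""
    chars = len(text)
    words = len(text.split())
    return {
        "by_chars": chars / 4,     # roughly four characters per token
        "by_words": words / 0.75,  # roughly 0.75 words per token
    }


sample = "Rocky 9 has been installed on the Bunya supercomputer."
print(estimate_tokens(sample))
```

The two estimates rarely agree exactly, which is a reminder that they are only a guide to the order of magnitude.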