The University of Queensland’s (UQ) Research Computing Centre (RCC), in collaboration with Hewlett Packard Enterprise (HPE), has commissioned the University’s second next-generation GPU platform for the most demanding artificial intelligence and high-performance computing applications.
The HPE Cray XD675, powered by AMD EPYC™ 9004 Series CPUs and AMD Instinct™ MI300X GPUs, will be deployed in UQ’s Bunya supercomputer, where it will support the University’s rapidly expanding supercomputing initiatives to accelerate AI-at-scale projects such as generative AI.
The XD675 gives UQ researchers additional cutting-edge GPU infrastructure.
Late last year, Lenovo provided UQ with its first AMD MI300X, inside Lenovo’s ThinkSystem SR685a v3 server, for trial and testing in a university scientific research setting. At the time, UQ was the first in Australia’s education sector to trial AMD’s next-generation GPU hardware.
The MI300X GPUs offer researchers an unprecedented amount of GPU memory together with high compute performance, both critical for the demands of modern machine vision, offline large language model (LLM) research and other accelerated supercomputing workloads.
RCC Director Jake Carroll said he is delighted to enable the research community with such significant GPU resources.
“We’re thankful and pleased HPE is able to support such an initiative with us. The difference this makes in research time-to-discovery, with these ‘extreme’ GPU platforms, is impactful, sector leading and game changing. It kickstarts a whole new generation of GPU and accelerated infrastructure capability for us,” said Jake.
“Researchers have been exploring this new technology with us for many months, but now we’re in a fortunate position to broaden the access to these systems and enable even more science.”
The XD675 joins the existing Lenovo SR685a v3 MI300X server, providing another memory-dense, highly connected GPU system well suited to large model training, demanding traditional high-performance computing and LLM inference.
Delivering up to 1307 teraflops of peak FP16 performance with 192 GB of HBM3 memory per GPU, the AMD Instinct MI300X can run LLMs of up to 80 billion parameters entirely in memory on a single GPU, without spreading the model across multiple GPUs. This memory density also means more inference jobs per GPU, which is critically important for generative AI (GenAI) workloads.
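The single-GPU figure can be sanity-checked with rough arithmetic. The sketch below is illustrative only: it assumes weights are stored at 2 bytes per parameter (FP16), counts model weights alone, and ignores the KV cache, activations and framework overhead that a real inference job also needs. The function names and the precision table are hypothetical helpers, not part of any AMD or RCC tooling.

```python
HBM_PER_GPU_GB = 192  # HBM3 capacity per MI300X, as quoted in the article

# Assumed storage cost per parameter for common precisions (bytes).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Estimate memory for model weights only, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def fits_on_one_gpu(num_params: float, precision: str = "fp16") -> bool:
    """True if the weights alone fit within one GPU's HBM capacity."""
    return weight_memory_gb(num_params, precision) <= HBM_PER_GPU_GB


if __name__ == "__main__":
    params = 80e9  # 80 billion parameters
    needed = weight_memory_gb(params)
    headroom = HBM_PER_GPU_GB - needed
    print(f"{needed:.0f} GB of weights, {headroom:.0f} GB headroom")
```

Under these assumptions an 80 billion parameter model needs about 160 GB for its weights, leaving roughly 32 GB of the 192 GB for everything else. The same model at 4 bytes per parameter would not fit on one GPU.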
Access to RCC’s AMD MI300X GPU infrastructure is available now to all UQ researchers with an account on Bunya. Visit RCC’s Bunya webpage to learn how to get an account on the supercomputer.
For more information, please read our article about the University’s first environment featuring AMD Instinct MI300X GPUs.
An HPE Cray XD675 server. Image from HPE's "QuickSpecs" for the server.