Next-gen AMD CPUs and GPUs will utilise some of the same technologies, interconnects, and optimisations planned for the US DOE's supercomputer at Oak Ridge National Laboratory, the company's CEO says. The Frontier system, due for delivery in 2021, is being built by Cray and AMD, and will be packed with AMD EPYC CPUs and Radeon Instinct GPUs to accelerate the modelling and simulation of complex structures, weather, and other research – but the rest of us are set to gain from AMD's latest exascale venture, too.
Over at the Hot Chips Symposium 31, AMD CEO Dr. Lisa Su took to the stage to talk a little more about the semiconductor industry's future. Naturally, coverage from the event quickly latched onto questions about Crossfire making a comeback and 7nm Threadripper, but nestled in the now publicly available VODs were some fascinating tidbits about AMD's upcoming next-gen silicon development.
“AMD announced together in partnership with Oak Ridge National Labs, the Department of Energy, as well as Cray, the next generation supercomputer at Oak Ridge,” Lisa Su says, “which is codenamed Frontier. Frontier is the system that will take advantage of all of these optimizations that I talked about, so highly optimized CPU, highly optimized GPU, highly optimized coherent interconnect between CPU and GPU, [and] working together with Cray on the node to node latency characteristics really enables us to put together a leadership system.”
“You know, of course, although these technologies are being designed for the highest performance supercomputers that have tremendous power and system capabilities, you’ll see all these technologies also come into more commercial systems. And you’ll see this in our next generation CPUs, as well as our next generation GPUs, to increase the computational efficiency.”
Su outlines what it takes to improve computational efficiency in 2019, and it's no longer simply a denser, bigger, better processor. She points to a few areas that are lagging behind the trajectory of traditional compute: according to the red team boss, memory bandwidth and CPU-to-GPU communication are developing at a slower pace, and it's up to AMD, among others in the industry, to close the gap as the world looks towards specialised, application-specific accelerators and ICs to get the job done.
“Although we’re continuing to invest in very, very close coupling between the memory and the computing element, the [growth] of memory bandwidth over time is not quite keeping up. So the ratio of memory bandwidth to compute operations is diverging a bit. We are big believers in high bandwidth memory; I think high bandwidth memory has a strong roadmap that is important to continue on an aggressive pace.”
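To put Su's point in concrete terms, the ratio she describes can be expressed as bytes of memory bandwidth available per floating-point operation. A rough back-of-the-envelope sketch – the figures below are approximate public peak numbers for illustration, not spec-sheet exact:

```python
# Illustrative only: ballpark peak figures showing the bytes-per-FLOP
# ratio Su refers to, and how HBM parts try to hold it up as TFLOPS climb.
def bytes_per_flop(mem_bandwidth_gbs, peak_tflops):
    """Return memory bytes available per floating-point operation."""
    return (mem_bandwidth_gbs * 1e9) / (peak_tflops * 1e12)

# Approximate figures (bandwidth in GB/s, peak FP32 TFLOPS).
gpus = {
    "Radeon R9 Fury X (2015, HBM1)": (512, 8.6),
    "Radeon RX Vega 64 (2017, HBM2)": (484, 12.7),
    "Radeon VII (2019, HBM2)": (1024, 13.4),
}

for name, (bw, tflops) in gpus.items():
    print(f"{name}: {bytes_per_flop(bw, tflops):.3f} bytes/FLOP")
```

When this ratio shrinks generation over generation, compute units sit idle waiting on memory – which is why Su singles out high bandwidth memory's roadmap as something to push aggressively.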
We're actually already seeing some of this trickle-down innovation today with the likes of Rapid Packed Math, which helped the Vega architecture hold its own in compute workloads. The feature was built to tackle big-data algorithms with greater efficiency, but it turns out that executing two FP16 operations in place of a single FP32 operation can increase efficiency even in games. That is, so long as any devs decide to use it… which unfortunately almost none did, making it about as successful as real trickle-down economics. Still, an industry-wide technology may gain traction where Rapid Packed Math did not.
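The trick behind packed math can be sketched in a few lines: two 16-bit values share one 32-bit register, and a single instruction operates on both lanes at once. A toy emulation in Python using the standard library's half-precision support – the helper names here are hypothetical, and on real Vega hardware this is a single GPU instruction rather than a software loop:

```python
import struct

def pack2(lo, hi):
    """Pack two fp16 values into one 32-bit word (low lane, high lane)."""
    return struct.unpack("<I", struct.pack("<ee", lo, hi))[0]

def unpack2(word):
    """Split a packed 32-bit word back into its two fp16 lanes."""
    return struct.unpack("<ee", struct.pack("<I", word))

def pk_add_f16(word_a, word_b):
    """Emulate a packed fp16 add: both lanes processed by one 'instruction'."""
    a0, a1 = unpack2(word_a)
    b0, b1 = unpack2(word_b)
    return pack2(a0 + b0, a1 + b1)

a = pack2(1.5, 2.25)
b = pack2(0.5, 4.0)
print(unpack2(pk_add_f16(a, b)))  # (2.0, 6.25)
```

The payoff is throughput: where precision allows, each register and each issued operation does double duty, which is exactly the efficiency win Su describes for both HPC and, potentially, games.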
As for interconnect technology, AMD already employs its own Infinity Fabric between the dies on its Ryzen 3000 and EPYC CPUs, yet it's industry-wide interconnect specifications like Compute Express Link (CXL) – whose members include AMD, Nvidia, and Intel – that could provide greater bandwidth and performance for datacentres and beyond. After all, CXL doesn't just stop at AMD products. Built on top of PCIe 5.0, it intends to bridge the gap between CPUs, GPUs, accelerators, ASICs, FPGAs, and more, from all manufacturers, with a low-overhead solution. And in today's multi-node, highly-optimised world, that could be pivotal to performance.
This vast bandwidth won't make much difference to game performance right now – but that's not to say it won't one day. As Su also outlines at Hot Chips, computer performance these days is just as much a software problem as it is a hardware one.
“As we go forward,” Su continues, “there’s even more need to have this hardware and software co-optimisation so that we get the best out of the performance capabilities.”
And as GPU and CPU ecosystems and open standards are developed by the same names and faces we're familiar with in the gaming world, more of that HPC technology, bandwidth, and resulting performance will make its way into game engines and APIs, one day making better use of the graphics cards and CPUs within our gaming machines. So while you won't get to game on the $600m Frontier exascale computer directly, even projects as grand as this help the humble gamer in one way or another.