Nvidia Ampere graphics card release date, specs, and rumours

The rumours about Nvidia's next-gen GPUs are such salty ones we're developing coronary troubles...


Nvidia Ampere, the GPU codename that’s being taken as read for the next generation of high-end graphics processing from the green team… or maybe it’s not. Maybe it’s Nvidia Hopper – with Nvidia choosing to play its graphics cards close to its chest, refusing to let slip its future GPU architecture roadmap, we still don’t actually know what the next-gen naming scheme is going to be. And that’s despite us potentially being only a few months away from its unveiling.

The general consensus is that Nvidia Ampere will be the name given to the new architecture, and that it is expected to be detailed for the first time at the GPU Technology Conference (GTC) in March. But there’s also the chance that Ampere is purely the logical successor to Volta and will only be implemented on the company’s professional-level GPUs and not the gaming cards set for release towards the end of the year.

Equally there’s a chance Nvidia Ampere was simply the original name for the Turing GPU architecture. It’s possible concerns over contesting the trademark with another company prompted a switch to Turing as an alternative name for the RTX 20-series GPUs. Ampere Computing is an existing Silicon Valley company also dealing in microprocessors and cloud computing, and it already has dibs on a bunch of Ampere trademarks.

There has been more recent speculation that the next generation of Nvidia graphics architectures will bear the name Hopper instead, potentially named after Grace Hopper. She was one of the pioneers of computer programming within the military, eventually serving as Rear Admiral in the US Navy. Her work influenced many early programming languages and her name looks to be the perfect choice to succeed the Turing architecture codename.

Towards the end of 2019 Nvidia trademarked ‘Nvidia Hopper’ as an API “for computer software for ray tracing and image rendering, modeling, and image manipulation and processing.” Essentially exactly what it states in the ‘Nvidia Turing’ trademark too. There has been no ‘Nvidia Ampere’ trademarking going down so there’s still a good chance that we’ll be looking at GH100 GPUs sometime around GTC in March. Or HP100. That sounds cooler.

But all the tantalising GPU rumours so far have been around the Ampere codename, so let’s just dive into that for now.

Vital stats

Nvidia Ampere release date
At the moment we’re expecting some sort of news about the next generation of Nvidia GPU architecture around the company’s GTC event from March 23 to March 26 2020. But in terms of actual next-gen GeForce graphics cards, we’re not expecting to see those until around September 2020 at the earliest.

Nvidia Ampere specs
The latest very speculative suggestion is that the top-end GA102 GPU will house a total of 5,376 CUDA cores across a massive 84 SMs. Even at the promised 7nm scale that’s going to be a hefty chip. Similarly the touted GA103 GPU suggested to be at the heart of the RTX 3080 will be relatively chunky with 3,840 CUDA cores. Oh, and they might have 10GB of GDDR6 as standard… or 20GB. Ain’t rumours fun?

Nvidia Ampere performance
Whether the 7nm GPU is from TSMC or Samsung, the rumoured 50% performance increase while halving power consumption sounds too good to be true. Normally you’d expect one or the other: a 50% speed bump at the same power, or 50% more efficiency at the same speed. But if Nvidia can pull both off at the same time Ampere will be a hell of an architecture.

Nvidia Ampere price
Can Nvidia go higher than it did with Turing? The RTX 2080 Ti coming in at $1,200 represented a hefty price increase over previous generations, but with AMD’s ‘Big Navi’ promising to increase competition Nvidia might not be able to repeat the price bump.


What is the Nvidia Ampere release date?

The next generation of Nvidia GPU architecture is expected to have some sort of unveiling during Jen-Hsun Huang’s opening keynote of the Nvidia GPU Technology Conference in San Jose, running from March 23 to March 26. That’s more likely to be a look at the professional side of the architecture, however, as the GTC event is far more enterprise than consumer focused.

Therefore, if Ampere is indeed the name of the next-gen GPU, then we’d be expecting a look at the super high-end GA100 chip. That’s a potential monster of a pro-level GPU, sporting AI and rendering silicon to make your eyes water and pixels quiver. But it’s not going to be made for gaming and will probably only appear in server-side form.

We’re not expecting to see the GeForce-based gaming cards of this new generation any time this half of the year. We think Nvidia will follow a similar cadence to its Turing unveiling and start showing off the new consumer GPUs around Gamescom at the end of August, with a potential launch in September.

Though that could depend on what happens with AMD’s Big Navi GPUs at the end of the year. Nvidia could decide to wait and see how they perform before finalising its own pricing and specs.


What are the Nvidia Ampere GPU specs?

The Nvidia Ampere specs are pure speculation right now. Winky face. There’s nothing concrete known about the upcoming chips, not the full configuration of the silicon, not the manufacturer, or the process node, and not even the true codename of the new generation. We’ve still got a sneaky feeling Hopper is the more likely candidate.

Hell, I’m not even convinced Nvidia’s going to follow the 20-series with something like an RTX 3080. That number just doesn’t float my boat.

Whatever, there have been some rumours presented as leaks doing the rounds, all sourced from the usual places – anonymous people ‘close to the matter,’ or simply randoms on Twitter. All very official, I’m sure you’ll agree. Realistically these are more likely to be vaguely educated guesses than actual leaks, so treat them more as interesting thought experiments rather than anything close to confirmed.

GPU      SMs    CUDA cores    Memory bus
Compute
GA100    128    8,192         6,144-bit
Gaming
GA102    84     5,376         384-bit
GA103    60     3,840         320-bit
GA104    48     3,072         256-bit
GA106    30     1,920         192-bit
GA107    20     1,280         128-bit

The numbers that a lot of the web have run with see the top-end professional Ampere chip, the GA100, sporting a potential total of 8,192 CUDA cores. Though that’s only if Nvidia sticks to the current configuration of 64 CUDA cores per SM. Similarly the top consumer chip, the GA102, is suggested to be running with 5,376 CUDA cores.
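Every core count doing the rounds falls out of one multiplication. A quick sketch, assuming the 64-cores-per-SM layout carries over; the SM counts are the rumoured figures, not confirmed ones, and the names here are just for illustration:

```python
# Rumoured SM counts per chip (speculative, not confirmed by Nvidia).
RUMOURED_SMS = {
    "GA100": 128,
    "GA102": 84,
    "GA103": 60,
    "GA104": 48,
    "GA106": 30,
    "GA107": 20,
}

CORES_PER_SM = 64  # the current configuration; assumed to carry over


def cuda_cores(sm_count, cores_per_sm=CORES_PER_SM):
    """Total CUDA cores for a chip with the given number of SMs."""
    return sm_count * cores_per_sm


core_counts = {chip: cuda_cores(sms) for chip, sms in RUMOURED_SMS.items()}

for chip, cores in core_counts.items():
    print(f"{chip}: {cores:,} CUDA cores")
```

Change CORES_PER_SM and every figure in the rumoured line-up moves with it, which is why the SM layout matters more than any single headline number.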

The pro-level chips will operate using HBM2 and the consumer level cards will stick with GDDR6 for their respective memory goodness. But there is huge variation in how much memory the rumours think Nvidia will stick on its cards. The GA103, with 3,840 CUDA cores, is suggested to be the core of the RTX 3080, but that could ship with either 10GB of GDDR6 or 20GB of GDDR6.
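The 10GB-or-20GB split looks less random if you assume one GDDR6 chip per 32 bits of memory bus, with chips coming in 1GB and 2GB capacities. That pairing is our assumption for illustration, not something the rumours spell out:

```python
GDDR6_CHIP_BUS_BITS = 32      # assumed: one GDDR6 chip per 32-bit slice of the bus
CHIP_CAPACITIES_GB = (1, 2)   # assumed: 8Gb and 16Gb chip densities


def memory_options(bus_width_bits):
    """Possible total capacities (in GB) for a given memory bus width."""
    chips = bus_width_bits // GDDR6_CHIP_BUS_BITS
    return [chips * capacity for capacity in CHIP_CAPACITIES_GB]


ga103_options = memory_options(320)  # rumoured GA103 bus width
print(ga103_options)
```

Under those assumptions a 320-bit bus takes ten chips, so both rumoured capacities describe the same board with different memory chips fitted.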

Way to hedge your bets, ‘leakers.’

Honestly the fact the rumours are pointing to a GA103 chip sounds weird. Nvidia has had a GPU numbering scheme it’s been sticking to, and that’s never included an XX103 chip before.

But the core counts could be right out the window anyways if the rumoured SM changes doing the rounds are anything to go by. The homemade image posted by @CorgiKitty suggests that Nvidia is doubling the floating point units, the FP32 cores, inside each Ampere SM. It claims that the INT32 units will remain the same, with 64 in each SM, but that there will be 128 FP32 cores.

It also claims a doubling of the Tensor Cores, with a more advanced RT Core being attached to each of the SMs in an Ampere GPU too.

In essence, for floating point operations, that would mean a GA103 could almost be said to have 7,680 cores inside it. And floating point calculations are what your graphics card is doing most often. With Turing, Nvidia stated that for every 100 FP instructions there were only 36 integer calculations needed. Ampere would still have those concurrent INT32 cores, but by beefing up the number of FP32 units the actual graphics power would be hugely increased.
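To see where the 7,680 figure comes from, here is a short sketch using the rumoured, unconfirmed per-SM unit counts alongside Nvidia’s stated Turing instruction mix. The variable names are ours:

```python
GA103_SMS = 60       # rumoured SM count
FP32_PER_SM = 128    # rumoured doubling, up from 64
INT32_PER_SM = 64    # rumoured to stay the same

fp32_units = GA103_SMS * FP32_PER_SM
int32_units = GA103_SMS * INT32_PER_SM

# Nvidia's Turing figure: 36 integer instructions per 100 floating point ones.
int_per_fp = 36 / 100

print(f"FP32 units:  {fp32_units:,}")
print(f"INT32 units: {int32_units:,}")
print(f"Unit ratio (FP:INT):        {FP32_PER_SM / INT32_PER_SM:.1f}:1")
print(f"Instruction ratio (FP:INT): {1 / int_per_fp:.1f}:1")
```

A 2:1 split of units would sit closer to that instruction mix than an even split does, which is what makes the claim sound plausible on paper.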

But @CorgiKitty has now deleted that tweet and has since said that the image about “doubled FP32 units was just a conception, with no actual evidence behind it.” So I think we can take it that the INT32 and FP32 cores are likely to remain in the same essential configuration.

But who’s going to make Nvidia’s new graphics silicon? There had been some speculation that Samsung would be the foundry partner for Nvidia’s next-gen project, with the company reportedly offering some discounts to customers on 7nm in order to gain ground against the TSMC foundry competition.

But Jen-Hsun Huang himself, Nvidia’s effusive CEO, has stated that most of the company’s 7nm orders would still be going through TSMC, even though it was giving a small number of 7nm orders over to Samsung.

@CorgiKitty has popped up again though, now claiming that the Ampere GPUs are defo going to be made by Samsung, but not on some 7nm or 8nm node, it’s going to be on 10nm. So. Much. Salt.


How will the Nvidia Ampere GPUs perform?

With the final GPU specs being so up in the air you’re not going to get anything concrete about performance either. Obviously moar corez means moar performance, and doubling the floating point units could help make Ampere a mean rasterising machine, but the shift in process node will also help things.

The latest performance speculation has come from the future installation of Nvidia’s next-gen GPUs in the Big Red 200 supercomputer at Indiana University. Originally set for Volta V100 GPUs, the decision was taken to delay the upgrade and allow for the use of next-gen Nvidia chips.

According to IU IT bods they needed fewer next-gen GPUs than they would have if they’d stuck with Volta, because the new graphics silicon offers between 70 – 75% higher performance.

Though that could just be based purely on AI performance…
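For a rough sense of what that uplift means for a GPU count, here’s a sketch that assumes perfectly linear scaling and a made-up baseline of 100 V100s. Neither is IU’s real figure:

```python
import math


def gpus_needed(baseline_gpus, speedup):
    """GPUs needed to match a baseline fleet, assuming perfectly linear scaling."""
    return math.ceil(baseline_gpus / speedup)


BASELINE = 100  # hypothetical V100 count, not IU's real figure

for uplift in (0.70, 0.75):
    count = gpus_needed(BASELINE, 1 + uplift)
    print(f"{uplift:.0%} faster: {count} next-gen GPUs per {BASELINE} V100s")
```

Real workloads rarely scale that neatly, so treat the result as a ceiling on the savings rather than a prediction.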

There has been speculation coming out of an unlikely news source, the Taipei Times, which suggests that the new Nvidia graphics architecture will be able to manage the combined feat of offering both 50% more performance as well as a dramatic halving in power consumption in comparison with the already impressive Turing technology.

It’s possible that this has been misconstrued and is referencing an increase of 50% in terms of performance per Watt. That would mean either a 50% performance boost in the same power envelope or roughly a one-third reduction in power draw at the same performance level.
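The two readings of the rumour are very different claims, which a little arithmetic makes plain. A sketch with Turing normalised to 1.0 for both performance and power; the normalisation and the function name are ours:

```python
def perf_per_watt(performance, power):
    """Efficiency relative to a baseline where performance = power = 1.0."""
    return performance / power


# Literal reading: 50% more performance AND half the power draw.
literal = perf_per_watt(1.5, 0.5)

# Perf-per-Watt reading: 1.5x efficiency, spent one of two ways.
same_power = perf_per_watt(1.5, 1.0)   # 50% faster in the same power envelope
same_speed_power = 1.0 / 1.5           # power needed to match Turing's speed

print(f"Literal reading: {literal:.1f}x perf per Watt")
print(f"Perf/W reading:  {same_power:.1f}x perf per Watt")
print(f"Power at Turing-level speed: {same_speed_power:.0%} of Turing")
```

On the literal reading the new chips would need to be three times as efficient as Turing, twice what the performance-per-Watt reading asks for.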

Though if Nvidia manages to hit that proposed target of both a 50% perf bump and a halving of power consumption through a mix of architecture and process node changes then we could have one hell of a graphics chip on our hands. And that means the best graphics card lists are going to look very different this time next year.

With a doubling of the AI-heavy Tensor Cores, and the proposed RT Core Advanced silicon in the rumoured Ampere spec, it should also be offering a serious bump in ray tracing performance as well as rasterised prowess.


How much will Nvidia Ampere cost?

The top-end Turing GeForce GPU, the RTX 2080 Ti, is still the fastest consumer graphics card available. Nvidia knew that at the time and that’s why it was able to price it at $1,200 accordingly. With no competition the green team could price its top chip with impunity, adding a huge premium on top of the highest-spec GPU of the Pascal-based GeForce generation, the GTX 1080 Ti.

Will that happen again? Will Nvidia bump the price of the RTX 3080 Ti even higher? Are we going to see a $1,500 card at the top of the Ampere range? I’m not convinced. With AMD seemingly sure its ‘Big Navi’ GPU will be able to close the gap on its green-tinged competition it looks like Nvidia isn’t going to have the 4K realm to itself in the next generation.

And, thankfully, that means there should be greater competition across all sections of the GPU market, which in turn should help bring pricing back to a more sensible level. Unless AMD decides to just follow Nvidia in going big on price…