NVIDIA’s first Pascal-based Tesla P100 doesn’t use the full Pascal GPU core

Update April 8, 2016: It’s been spotted that the Tesla P100 GPU, the first based around NVIDIA’s brand new Pascal microarchitecture, features 56 SM units, whereas the GP100 Pascal GPU core design features 60 SM units. In other words, the Tesla P100 doesn’t use everything Pascal has on offer.

Overclock3d spotted the discrepancy between the Tesla P100’s specs and the GP100’s core design illustration, and they further speculate that fully enabled GP100-powered Pascal GPUs may reach “11.3TFLOPs of single precision performance.”
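
For the curious, that figure is easy to sanity-check. The sketch below is a back-of-envelope estimate, not anything published by NVIDIA. It assumes 64 CUDA cores per SM (the Tesla P100’s 3584 cores divided by its 56 SMs), two floating-point operations per core per clock (one fused multiply-add), and that a fully enabled GP100 would run at the P100’s 1480 MHz boost clock, which is purely an assumption.

```python
# Rough peak FP32 estimate: cores x 2 FLOPs per clock (one fused multiply-add) x clock.
CORES_PER_SM = 3584 // 56   # 64, derived from the Tesla P100 figures in this article
BOOST_CLOCK_HZ = 1480e6     # Tesla P100 boost clock; a full GP100 part may differ


def peak_fp32_tflops(sm_count, clock_hz=BOOST_CLOCK_HZ):
    """Theoretical peak single-precision TFLOPS for a given SM count."""
    cores = sm_count * CORES_PER_SM
    return cores * 2 * clock_hz / 1e12


print(f"Tesla P100 (56 SMs): {peak_fp32_tflops(56):.2f} TFLOPS")  # 10.61
print(f"Full GP100 (60 SMs): {peak_fp32_tflops(60):.2f} TFLOPS")  # 11.37
```

At the same clock, the four extra SMs would be worth roughly 0.76 TFLOPS, which lands close to the figure quoted above.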

Why might NVIDIA not make use of the full Pascal core for this first model? Well, it’s a very complicated design, the first to be built using the 16nm FinFET process, with HBM 2 memory to boot. Basically, combining all these new technologies at a microscopic scale is a nightmare, and since the Tesla P100s will be the first coming through the assembly line, perhaps NVIDIA want to focus on getting a high yield of working chips early doors before extending to Pascal’s full capability as the manufacturing process matures. That’s me guessing, of course.

Original story: Here’s what we’ve known about NVIDIA’s upcoming Pascal microarchitecture until yesterday’s GPU Technology Conference: its name, and the faint assurance that it’d be really, really powerful compared to current Maxwell cards. But now the era of idle speculation is (sort of) over. Finally, the graphics giant have pulled back the curtain on Pascal and its GP100 GPU, and released specs of its first incarnation, the Tesla P100.

Slightly unfortunately – although entirely to be expected – the Tesla P100 is intended for high-performance computing rather than gaming (making sure nukes don’t accidentally get fired/your video stream buffers quickly, that sort of thing). So although there is a detailed list of specs now out there for all to see, it’s hard to compare those specs to, say, a GTX 980 Ti, and draw any meaningful conclusions.

Of course, you could just order a Pascal-equipped server from Dell, IBM et al and sit on your hands until their likely arrival in early 2017. Or test out the 170 TFLOPS performance for yourself this June, if you have $129,000 burning a hole in your pocket to exchange for a DGX-1 box. And a 3200W power supply.

So yes, entry requirements are currently a bit high to be talking about Pascal in a gaming scenario with any great insight. The focus of CEO Jen-Hsun Huang’s GTC talk was very much deep learning, and intended to entice the scientific and supercomputing communities towards Pascal more than the 4K, 60 fps chasers.

It’s even possible that the GP100 won’t make it down to GeForce gaming cards, but will instead remain in the realm of ultra-expensive Tesla and Quadro ‘professional’ cards. That’d be unlikely though, judging by the company’s previous rollout strategies for new microarchitectures.

So, what do we actually know about how Pascal works after Jen-Hsun Huang’s talk? Well, firstly it’s much better at Unified Memory, which creates a shared virtual address space that both the CPU and GPU can access. Programmers have been offered Unified Memory since NVIDIA’s CUDA 6, but with Pascal’s GP100 comes the ability to handle page faults on the GPU. It might not sound scintillating, but it means code running on the GPU can simply assume the data it needs is already in GPU memory, rather than having everything copied there before the work starts, which slows the process down. If the data isn’t there yet, a page fault is triggered and the missing page is copied across on demand.
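
To make the idea concrete, here is a toy model of on-demand paging in plain Python. It does not touch a GPU or call CUDA at all (real code would allocate managed memory with `cudaMallocManaged`), and the `ManagedBuffer` class, the `gpu_read` method and the four-element page size are all made up for illustration. It only shows the principle: nothing is copied up front, and a page moves across the first time the "GPU" touches it.

```python
PAGE_SIZE = 4  # elements per page; tiny on purpose, real pages are far larger


class ManagedBuffer:
    """Toy model of a buffer shared by CPU and GPU. Hypothetical, not a CUDA API."""

    def __init__(self, data):
        self.data = list(data)
        self.resident_on_gpu = set()  # page numbers currently in "GPU memory"
        self.page_faults = 0

    def gpu_read(self, index):
        page = index // PAGE_SIZE
        if page not in self.resident_on_gpu:
            # Page fault: migrate only the missing page, then carry on.
            self.page_faults += 1
            self.resident_on_gpu.add(page)
        return self.data[index]


buf = ManagedBuffer(range(16))                      # 16 elements, i.e. 4 pages
total = sum(buf.gpu_read(i) for i in (0, 1, 2, 9))  # touches pages 0 and 2 only
print(total, buf.page_faults, sorted(buf.resident_on_gpu))  # 12 2 [0, 2]
```

Only two of the four pages are ever moved, which is the appeal of the approach when a program ends up using a fraction of a large data set.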

The other big-hitter on the spec sheet is HBM 2 memory. Unlike AMD’s Fiji, which uses HBM 1, the GP100 has four stacks of HBM 2 memory running on a 4096-bit bus at 1.4Gbps per pin, resulting in around 720GB/s of total memory bandwidth. And while we’re on an impressive number theme, the GP100 in the Tesla P100 features 3584 CUDA cores, a 1328 MHz base clock and a 1480 MHz boost. You can read the full spec list over at NVIDIA’s devblog.
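
The bandwidth figure follows from the bus width and data rate. A quick sketch of the arithmetic, assuming the 1.4Gbps rate applies to each of the 4096 data pins (the function name is just for illustration):

```python
def peak_bandwidth_gb_s(bus_width_bits, gbps_per_pin):
    """Peak memory bandwidth in GB/s: total bits per second divided by 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8


print(f"{peak_bandwidth_gb_s(4096, 1.4):.1f} GB/s")  # 716.8 GB/s
```

That works out at 716.8GB/s, which is where the rounded 720GB/s headline number comes from.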

There’s limited value in reeling off the specs currently though, because while they appear impressive when lined up next to current Maxwell GeForce GPUs, we just don’t know yet how Pascal will filter down into the realm of gaming hardware. Still, it’s interesting to see NVIDIA finally show their hand with the tech that will surely power the next bout of gamer-level market landgrabbing with AMD.

Thanks PC Gamer, The Register.