Nvidia’s AI is getting smarter at a stunning rate. A new deep learning model over at Nvidia Research can turn your childish doodles into photorealistic masterpieces, complete with reflections, seasonal changes, and shadows.
The AI-infused app is called GauGAN, a nod to post-impressionist French painter Paul Gauguin. Unlike Gauguin, who didn’t find fame until after his death in 1903, Nvidia is hoping that the world will appreciate its latest breakthrough while the company is still around.
And while GauGAN is awfully talented at creating photorealistic imagery from little more than a stick figure, it does so without any knowledge of what a tree, lake, or ocean actually is. It’s never even experienced the warmth of the sun on its cold, metal exoskeleton. Instead it relies on a pair of networks, a generator and a discriminator, to produce what it believes to be the correct interpretation. The generator creates the images, while the discriminator, trained on datasets of real-life images, tells it where it went wrong, pixel-by-pixel. A digital Miss Honey, of sorts.
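For the curious, that back-and-forth between generator and discriminator can be sketched in a few lines of Python. This is a minimal illustration of the textbook GAN objective using binary cross-entropy, not Nvidia's actual training code; the function names and the small grids of scores are made up for the example, and GauGAN's real loss may well differ.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and targets."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants real images scored 1 and generated ones scored 0."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(d_fake):
    """The generator wants its own images to be scored 1 ("looks real")."""
    return bce(d_fake, np.ones_like(d_fake))

# Hypothetical discriminator outputs: one "realness" score per image region,
# loosely mirroring the pixel-by-pixel feedback described above.
d_real = np.array([[0.90, 0.80], [0.95, 0.85]])       # scores for a real photo
d_fake_bad = np.array([[0.10, 0.20], [0.05, 0.15]])   # an unconvincing fake
d_fake_good = np.array([[0.60, 0.70], [0.65, 0.55]])  # a more convincing fake

print("generator loss, unconvincing fake:", generator_loss(d_fake_bad))
print("generator loss, convincing fake:  ", generator_loss(d_fake_good))
print("discriminator loss vs convincing fake:", discriminator_loss(d_real, d_fake_good))
```

The more convincing the fake, the lower the generator's loss and the higher the discriminator's, which is the tug-of-war that pushes both networks to improve.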
The system is geared towards nature imagery and open landscapes, but the neural network plugging away in the background is capable of much more. Nvidia has also trained it to recognise buildings, roads, and people.
Nvidia has already demonstrated that its deep learning algorithms are particularly handy at generating realistic, lifelike human faces. But they can’t do cats. There’s just something about these cute little critters that an AI can’t get its virtual head around: they come out as slithery jumbles of legs, fur, and whiskers.
But GauGAN is pretty darn good at what it does. From just a simple sketch in a program even more basic than MS Paint, the AI can generate spectacular, beautiful imagery of nowhere in particular: an entirely made-up yet photorealistic scene generated before your very eyes.
“This technology is not just stitching together pieces of other images, or cutting and pasting textures,” Bryan Catanzaro, VP of applied deep learning at Nvidia, says. “It’s actually synthesizing new images, very similar to how an artist would draw something.”
The tech is built upon a research paper called Semantic Image Synthesis with Spatially-Adaptive Normalization, authored by Nvidia researchers Taesung Park, Ming-Yu Liu, Ting-Chun Wang, and Jun-Yan Zhu. Park carried out the research during an Nvidia internship.
The paper goes into the actual neural net layer responsible for synthesizing realistic images, and includes important comparisons with competing methods. In a user preference study, the majority of users preferred the results from Nvidia’s SPADE over competing approaches, even on challenging, complex datasets such as COCO-Stuff. As you can see in the image above, while Nvidia’s AI has created an awfully dumpy zebra, for the most part the results are rather impressive.
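The layer in question, spatially-adaptive normalisation (SPADE), boils down to this: normalise the network's activations, then scale and shift them with values computed per pixel from the labelled sketch. Here is a rough NumPy sketch of the idea under some loud simplifications. In particular, the per-label weight tables `w_gamma` and `w_beta` are stand-ins invented for this example in place of the small learned convolutional network that would normally produce the scale and shift, and the helper names are hypothetical.

```python
import numpy as np

def one_hot(label_map, num_classes):
    """Turn an (H, W) map of integer labels into (num_classes, H, W) of 0/1."""
    classes = np.arange(num_classes)[:, None, None]
    return (classes == label_map[None]).astype(np.float32)

def spade(x, seg, w_gamma, w_beta, eps=1e-5):
    """Simplified spatially-adaptive normalisation.

    x:       activations, shape (N, C, H, W)
    seg:     one-hot segmentation map, shape (N, L, H, W)
    w_gamma: per-label scale, shape (C, L)  (assumption: stands in for learned convs)
    w_beta:  per-label shift, shape (C, L)  (assumption: stands in for learned convs)
    """
    # Normalise each channel over the batch and spatial dimensions.
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mu) / np.sqrt(var + eps)
    # Look up a scale and shift for every pixel from its label.
    gamma = np.einsum("cl,nlhw->nchw", w_gamma, seg)
    beta = np.einsum("cl,nlhw->nchw", w_beta, seg)
    return gamma * x_hat + beta

# A hypothetical 4x4 doodle: label 0 ("sky") on the left, label 1 ("water") on the right.
label_map = np.array([[0, 0, 1, 1]] * 4)
seg = one_hot(label_map, num_classes=2)[None]       # add a batch dimension
x = np.random.default_rng(0).normal(size=(1, 1, 4, 4))

out = spade(x, seg, w_gamma=np.array([[1.0, 3.0]]), w_beta=np.array([[0.0, 5.0]]))
print(out.shape)
```

Because the scale and shift vary with the label under each pixel, the "sky" and "water" halves of the doodle get treated differently, so the layout of the sketch keeps steering the image rather than being washed out by normalisation.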
So if you want to put together a backdrop, reimagine a space, or lie on Instagram that you’ve been on holiday, Nvidia may soon have the AI for you. Zero artistic skill required.