Chip maker Nvidia has a new machine learning-powered program that lets you turn what look like crappy MSPaint drawings (though there’s more to it than that) into realistic landscapes. Just paint a few large blocks of color that tell the computer where you want the beach and where you want the clouds, and the program fills in the blanks to generate the perfect Instagram-ready image.
Nvidia calls this new program GauGAN and it works thanks to generative adversarial networks, or GANs. These programs “learn” by scanning a vast amount of training inputs—photographs of landscapes in this case—to produce new examples.
Nvidia has done plenty of work with GANs lately, and has already released bits of its code on GitHub. Clever folks have used it to create programs that generate random human faces and non-existent cats.
GauGAN allows users to select basic elements like water, snow, grass, or gravel, then paint broad swaths of MSPaint-style blocks. Once the element is placed, GauGAN reaches into its neural network and fills in the details to create a beautiful picture.
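The “painting” GauGAN starts from is essentially a segmentation map: a grid where each pixel holds a class label (water, snow, grass) rather than a color. As a rough sketch of that idea—with made-up class IDs and a one-hot encoding, which is how conditional image generators commonly consume such maps, not necessarily GauGAN’s exact format—consider:

```python
import numpy as np

# Hypothetical sketch of the kind of input a tool like GauGAN works
# from: a "segmentation map" where each pixel stores a class label
# instead of a color. The class IDs below are invented for illustration.
SKY, WATER, SAND = 0, 1, 2

h, w = 4, 6
label_map = np.full((h, w), SKY, dtype=np.int64)
label_map[2:] = WATER        # paint the bottom half as water
label_map[3, :2] = SAND      # a small patch of beach in one corner

# Conditional image GANs typically receive this as a one-hot tensor,
# one channel per class, which the generator then turns into a photo.
num_classes = 3
one_hot = np.eye(num_classes)[label_map]   # shape (h, w, num_classes)
print(one_hot.shape)
```

The broad MSPaint-style strokes in the demo are just a friendly way of editing this label grid; the neural network does the rest.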
“It’s like a coloring book picture that describes where a tree is, where the sun is, where the sky is,” Bryan Catanzaro, Nvidia’s vice president of applied deep learning research, said in a video about the project.
A GAN does more than just stitch together elements from the millions of photos it’s scanned—the architecture comprises two processes working in concert to generate images. A “generator” AI creates an image of a tree and presents it to a “discriminator” AI. The discriminator is trained to analyze an image and decide whether it’s real or a creation of the generator. The tree only pops into existence once the generator has produced one that fools the discriminator. This back-and-forth is where GANs get the “adversarial” part of their name.
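That back-and-forth can be sketched in miniature. The toy below is not GauGAN—real image GANs use deep convolutional networks—but it shows the adversarial loop on a one-dimensional problem: the generator (an affine map of noise, parameters assumed here) learns to produce numbers that look like samples from the “real” distribution, while the discriminator (logistic regression) tries to tell them apart.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Generator: x_fake = g_w * z + g_b, starting far from the real data.
g_w, g_b = 1.0, 0.0
# Discriminator: D(x) = sigmoid(d_w * x + d_b), probability x is real.
d_w, d_b = 0.0, 0.0

lr, batch = 0.05, 64
for step in range(2000):
    z = rng.normal(size=batch)               # noise fed to the generator
    x_real = rng.normal(4.0, 0.5, size=batch)  # "real" data ~ N(4, 0.5)
    x_fake = g_w * z + g_b

    # Discriminator update: push D(real) toward 1 and D(fake) toward 0
    # (gradient descent on -log D(real) - log(1 - D(fake))).
    p_real = sigmoid(d_w * x_real + d_b)
    p_fake = sigmoid(d_w * x_fake + d_b)
    d_w += lr * np.mean((1 - p_real) * x_real - p_fake * x_fake)
    d_b += lr * np.mean((1 - p_real) - p_fake)

    # Generator update: try to fool the (freshly updated) discriminator
    # (gradient descent on -log D(fake)).
    p_fake = sigmoid(d_w * x_fake + d_b)
    grad_x = (1 - p_fake) * d_w
    g_w += lr * np.mean(grad_x * z)
    g_b += lr * np.mean(grad_x)

# After training, fake samples should cluster near the real mean of 4.
fakes = g_w * rng.normal(size=1000) + g_b
print(fakes.mean())
```

Neither network ever sees the other’s internals—each only reacts to the other’s outputs, which is the “adversarial” dynamic the article describes.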
Nvidia will present a research paper on GauGAN at the Conference on Computer Vision and Pattern Recognition in June. The paper goes deeper than simple landscapes—Nvidia is also using GauGAN to build photographs of surfers, ducks in a pond, home decor, food, and complex street scenes.
According to Catanzaro, this tech could be used to quickly (and thus, cheaply) create detailed virtual worlds to, say, train self-driving cars before they hit the road.