NVIDIA has released the latest version of its wildly popular AI painting demo called GauGAN2. Successor to GauGAN, the new GauGAN2 text-to-image feature allows anyone to channel their imagination into photorealistic masterpieces.
The user simply needs to type a phrase like “sunset at a beach” and the AI generates the scene in real time. Add an adjective like “sunset at a rocky beach,” or swap “sunset” for “afternoon” or “rainy day,” and the model, based on generative adversarial networks, instantly modifies the picture. It’s an iterative process, where every word the user types into the text box adds more to the AI-created image.
With the press of a button, users can generate a segmentation map, a high-level outline that shows the location of objects in the scene. From there, they can switch to drawing, tweaking the scene with rough sketches using labels like sky, tree, rock and river, allowing the smart paintbrush to incorporate these doodles into stunning images.
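A segmentation map of this kind is essentially a 2D grid where each cell holds a class label (sky, tree, rock, river) rather than a color. The sketch below is a minimal, hypothetical illustration of the idea in Python with NumPy; the label names and colors are invented for the example and are not GauGAN2’s actual label set or API.

```python
# Toy illustration of a segmentation map: a 2D grid of class labels.
# Labels and colors are invented for this example, not GauGAN2's real ones.
import numpy as np

LABELS = {0: "sky", 1: "tree", 2: "rock", 3: "river"}
COLORS = {0: (135, 206, 235), 1: (34, 139, 34),
          2: (128, 128, 128), 3: (65, 105, 225)}

# A tiny 4x6 scene: sky on top, trees and rocks in the middle, river below.
seg_map = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [2, 1, 1, 2, 2, 2],
    [3, 3, 3, 3, 3, 3],
])

# Render the label grid as an RGB image, one color per class --
# this is how a rough "smart paintbrush" doodle can be represented
# before a model turns it into a photorealistic image.
rgb = np.zeros(seg_map.shape + (3,), dtype=np.uint8)
for label, color in COLORS.items():
    rgb[seg_map == label] = color

print(rgb.shape)  # (4, 6, 3)
```

In a tool like GauGAN2, editing the sketch amounts to relabeling cells in this grid; the generator then resynthesizes the photorealistic image from the updated map.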
With the versatility of text prompts and sketches, GauGAN2 lets users create and customize scenes more quickly and with finer control. GauGAN2 combines segmentation mapping, inpainting and text-to-image generation in a single model, making it a powerful tool to create photorealistic art with a mix of words and drawings.
The demo is one of the first to combine multiple modalities — text, semantic segmentation, sketch and style — within a single GAN framework. This makes it faster and easier to turn an artist’s vision into a high-quality AI-generated image.
Rather than needing to draw out every element of an imagined scene, users can enter a brief phrase to quickly generate the key features and theme of an image, such as a snow-capped mountain range. This starting point can then be customized with sketches to make a specific mountain taller, or to add a couple of trees in the foreground or clouds in the sky.
It doesn’t just create realistic images — artists can also use the demo to depict otherworldly landscapes. Imagine, for instance, recreating a landscape from Tatooine, the iconic planet in the Star Wars franchise, which has two suns. All that’s needed is the text “desert hills sun” to create a starting point, after which users can quickly sketch in a second sun.
The AI model behind GauGAN2 was trained on 10 million high-quality landscape images using the NVIDIA Selene supercomputer, an NVIDIA DGX SuperPOD system that’s among the world’s 10 most powerful supercomputers. The researchers used a neural network that learns the connection between words and the visuals they correspond to, such as “winter,” “foggy” or “rainbow.”
Compared to state-of-the-art models specifically for text-to-image or segmentation map-to-image applications, the neural network behind GauGAN2 produces a greater variety and higher quality of images.
The GauGAN2 research demo illustrates the future possibilities for powerful image-generation tools for artists. One example is the NVIDIA Canvas app, which is based on GauGAN technology and available to download for anyone with an NVIDIA RTX GPU.
Source: indiaai.gov.in