Google Imagen: text-to-image AI

Alex Bobes
2 min read · May 25, 2022


Google is working on a machine learning system that generates images from text input. Users submit a description in words, and the AI turns it into a picture. According to Google, the Imagen diffusion model, developed by the Google Research Brain Team, delivers an extraordinary degree of photorealism and a deep level of language understanding.

This isn’t the first AI model of its kind. OpenAI’s DALL-E (and its successor, DALL-E 2) generated headlines as well as images thanks to its ability to turn words into pictures. Google’s version, by contrast, aims to produce more realistic visuals.

The researchers built a benchmark called DrawBench to compare Imagen with other text-to-image models (such as DALL-E 2, VQ-GAN+CLIP, and Latent Diffusion Models). Each model was given the same set of 200 text prompts, and the images it generated were evaluated by human raters. “In side-by-side comparisons, they choose Imagen over other models, both in terms of sample quality and image-text alignment,” Google said of the raters.
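To make the idea of a side-by-side comparison concrete, here is a minimal sketch of how rater votes could be tallied into preference rates. The vote labels and the sample data are made up for illustration; this is not Google's evaluation code, and the article does not describe how DrawBench results were actually aggregated.

```python
from collections import Counter


def preference_rates(votes):
    """Return the share of votes each label received.

    `votes` holds one string per rater judgment. The labels used here
    ("imagen", "other", "tie") are hypothetical, not taken from DrawBench.
    """
    if not votes:
        raise ValueError("no votes to tally")
    counts = Counter(votes)
    total = len(votes)
    return {label: count / total for label, count in counts.items()}


# Invented votes for a single prompt, purely for illustration.
example_votes = ["imagen", "imagen", "other", "tie", "imagen"]
rates = preference_rates(example_votes)
print(rates)
```

In a real study the same tally would be run separately for each question put to the raters, for instance once for sample quality and once for image-text alignment.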

[Image: Imagen AI-generated images]

Imagen, like DALL-E, is not available to the general public, and Google gives a variety of reasons why it is not yet suitable for public use. For starters, text-to-image models are usually trained on massive, uncurated datasets scraped from the web, which causes a slew of issues.

However, the researchers may someday allow members of the public to enter text into a version of the model to create their own visuals. “We will investigate a framework for responsible externalization in future work,” the researchers stated, “that balances the usefulness of external auditing with the hazards of unrestrained open-access.”

You can, however, try Imagen in a limited form. On the project’s website, you can construct a description from pre-selected terms. Users can choose whether the image is a photograph or an oil painting, as well as the sort of animal shown, its attire, the action it is performing, and the scene. So, if you’ve ever wanted to see an oil painting of a fuzzy panda skateboarding on a beach while wearing sunglasses and a black leather jacket, now’s your chance.
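Building a description from pre-selected terms amounts to filling a fixed template from a few option lists. The sketch below shows one way that could look. The option lists, the template wording, and the `build_prompt` helper are all assumptions made for illustration; only the panda example comes from the article, and the real demo's choices and phrasing may differ.

```python
from itertools import product

# Hypothetical option lists. The first entry in each list (or the second,
# for "style") mirrors the article's example; the rest are invented.
OPTIONS = {
    "style": ["A photo of", "An oil painting of"],
    "animal": ["a fuzzy panda", "a corgi"],
    "attire": ["wearing sunglasses and a black leather jacket",
               "wearing a cowboy hat"],
    "action": ["skateboarding", "riding a bike"],
    "scene": ["on a beach", "in a garden"],
}


def build_prompt(style, animal, attire, action, scene):
    """Join one choice per category into a single text prompt."""
    chosen = {"style": style, "animal": animal, "attire": attire,
              "action": action, "scene": scene}
    for field, value in chosen.items():
        # Reject free text: only the pre-selected terms are allowed.
        if value not in OPTIONS[field]:
            raise ValueError(f"{value!r} is not an allowed {field}")
    return f"{style} {animal} {attire} {action} {scene}."


prompt = build_prompt(
    "An oil painting of",
    "a fuzzy panda",
    "wearing sunglasses and a black leather jacket",
    "skateboarding",
    "on a beach",
)
print(prompt)

# Every prompt that can be built from the lists above.
all_prompts = [build_prompt(*combo) for combo in product(*OPTIONS.values())]
print(len(all_prompts))
```

Restricting input to fixed lists keeps the set of possible prompts small and known in advance, which fits the cautious release approach the researchers describe.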

Originally published at https://alexbobes.com on May 25, 2022.


Written by Alex Bobes

Technology expert. I’m writing about Web 3.0, Crypto, Blockchain, AI, and other topics. We’re now living in the age of the algorithm.
