Google’s most advanced image generator has arrived, months after the tech giant teased the model at this year’s Google I/O event. The Imagen 3 model is now available through Google’s Gemini AI platform, in both the free version and the subscription-based Gemini Advanced service, as well as within Google’s business products. Google is clearly keen for Imagen 3 to hold its own in the rapidly growing field of AI image generators, with its own approach to turning words into images.
Like its predecessors, Imagen 3 can create images in any number of styles, including the photorealistic landscapes and cartoonish claymation seen above. The new version improves on Imagen 2 in many ways, particularly when it comes to making pictures of people. Google hinted strongly that you won’t see Imagen 3 repeat the historically inaccurate images that embarrassed the company earlier this year. That said, “photorealistic, identifiable individuals” are still forbidden.
Imagen 3 also includes the real-time editing options spotted in the code last month. You can tell Gemini what you think of a generated image and instruct the AI to change it in whatever way you prefer. The company didn’t mention being able to circle the part of the image you want adjusted, but that may come later. Imagen 3 has been integrated across Gemini, starting in English, with more languages on the way. Imagen 3 is supposed to serve as a major draw for Gemini, which Google seems to want people to turn to as a default option, much as so many people unthinkingly go to its search engine.
AI Image War
Imagen 3 also continues Google’s practice of using its SynthID tool to watermark AI-generated images created with Gemini. SynthID embeds an invisible watermark in each image, so you won’t notice it, but an attempt to pass the image off as a real photo or something you painted would be debunked quickly. Google describes it as a way of pushing back against misinformation and making the world of AI images more transparent. SynthID is another of the safety measures employed by Google for Imagen 3, along with its guardrails against producing pictures of identifiable people, violent imagery, and other problematic scenes.
Imagen 3 is a clear indicator of the rapid advancements in AI image creation and its integration into all sorts of content creation platforms. That’s one area where Google has an edge over most of its competition. Ideogram, Midjourney, and other AI image makers tend to be stand-alone tools. On the other hand, OpenAI has DALL-E as a key feature of ChatGPT, and X recently embedded Flux into the Grok AI chatbot. Imagen 3 combined with Gemini gives Google a definite boost, but there’s no way of knowing which, if any, of the AI image generators will dominate the race. It will be a photo(realistic) finish.