On Wednesday, Meta released a free standalone AI image generator website, "Imagine with Meta AI," based on its Emu image synthesis model. Meta used 1.1 billion publicly visible Facebook and Instagram images to train the AI model, which can render a novel image from a written prompt. Previously, Meta's version of this technology—using the same data—was only available in messaging and social networking apps such as Instagram.
Meta's model handles photorealism competently, though not as well as Midjourney. It follows complex prompts better than Stable Diffusion XL, but perhaps not as well as DALL-E 3. Text rendering is a clear weakness, and it produces media styles like watercolors, embroidery, and pen-and-ink with mixed results. Its images of people do include a range of ethnic backgrounds. Overall, it sits about average among today's AI image synthesis models.
According to a research paper Meta released in September, Emu gets its ability to generate high-quality images through a process called "quality-tuning." Unlike traditional text-to-image models, which are trained only on large numbers of image-text pairs, Emu adds an "aesthetic alignment" stage after pre-training, fine-tuning on a relatively small set of exceptionally visually appealing images.
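Meta has not released code for Emu, but the general recipe the paper describes, taking a pre-trained latent diffusion model and continuing training at a low learning rate on a small curated dataset, looks roughly like the sketch below. The model ID, dataset loader, and hyperparameters here are illustrative stand-ins (Stable Diffusion components via Hugging Face's diffusers library), not Meta's actual setup.

```python
# A minimal sketch of "quality-tuning": fine-tune a pretrained latent
# diffusion model on a small, hand-curated set of high-aesthetic images.
# Model ID, dataset, and hyperparameters are assumptions, not Meta's.
import torch
from torch.utils.data import DataLoader
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from transformers import CLIPTextModel, CLIPTokenizer

model_id = "runwayml/stable-diffusion-v1-5"  # stand-in for Emu's pretrained base
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")
tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
noise_scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")

vae.requires_grad_(False)           # only the denoising UNet is tuned here
text_encoder.requires_grad_(False)
optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-5)  # low LR: polish, don't retrain

def quality_tune(curated_loader: DataLoader, epochs: int = 4):
    """curated_loader yields (pixel_values, captions) for a few thousand
    hand-picked, high-aesthetic images -- tiny next to the pre-training set."""
    unet.train()
    for _ in range(epochs):
        for pixel_values, captions in curated_loader:
            # Encode images into the VAE latent space.
            latents = vae.encode(pixel_values).latent_dist.sample()
            latents = latents * vae.config.scaling_factor

            # Standard diffusion objective: add noise, predict the noise.
            noise = torch.randn_like(latents)
            timesteps = torch.randint(
                0, noise_scheduler.config.num_train_timesteps,
                (latents.shape[0],), device=latents.device,
            )
            noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)

            # Condition the UNet on the text prompt.
            tokens = tokenizer(
                list(captions), padding="max_length", truncation=True,
                max_length=tokenizer.model_max_length, return_tensors="pt",
            )
            encoder_states = text_encoder(tokens.input_ids)[0]

            pred = unet(noisy_latents, timesteps,
                        encoder_hidden_states=encoder_states).sample
            loss = torch.nn.functional.mse_loss(pred, noise)

            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
```

The training objective is unchanged from pre-training; what does the work, per the paper, is the data: a small pool of images filtered hard for visual quality, which nudges the model's outputs toward that aesthetic without eroding what it learned at scale.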
At Emu's heart, however, is the aforementioned massive pre-training dataset of 1.1 billion text-image pairs pulled from Facebook and Instagram. The Emu research paper does not specify where that training data came from, but reports from the Meta Connect 2023 conference quote Meta's president of global affairs, Nick Clegg, confirming that the company used social media posts as training data for its AI models, including the images fed into Emu.