Meta has announced new generative AI tools to help users edit photos and produce videos from text descriptions. The features are built on Emu, Meta’s model for image generation. According to Meta, Emu was pre-trained on 1.1 billion image-text pairs and then fine-tuned on thousands of curated high-quality images.
What is Emu Video?
Emu Video is Meta’s generative AI tool for video creation. The platform is built on Meta’s Emu model and can either generate video from a text prompt or inject movement into still images.
“We’ve split the process into two steps: first, generating images conditioned on a text prompt, and then generating video conditioned on both the text and the generated image. This ‘factorised’ or split approach to video generation lets us train video generation models efficiently.”
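The two-step pipeline Meta describes can be sketched roughly as follows. This is illustrative only: Emu Video has no public API, so the function names and stand-in models below are hypothetical placeholders, not Meta's actual code.

```python
# Sketch of the "factorised" text-to-video approach Meta describes.
# The model calls are simplified stand-ins, not a real Emu Video API.

def text_to_image(prompt: str) -> str:
    # Stand-in for the image model: returns a placeholder "image".
    return f"image({prompt})"

def image_and_text_to_video(prompt: str, image: str, num_frames: int = 64) -> list:
    # Stand-in for the video model: a four-second clip at 16 fps is 64 frames.
    return [f"frame {i} of video({prompt}, {image})" for i in range(num_frames)]

def generate_video(prompt: str) -> list:
    # Step 1: generate a still image conditioned on the text prompt.
    image = text_to_image(prompt)
    # Step 2: generate video conditioned on both the text and that image.
    return image_and_text_to_video(prompt, image)

frames = generate_video("a dog running across a lawn")
print(len(frames))  # 64 frames, i.e. four seconds at 16 fps
```

Splitting the task this way means each stage solves a simpler problem than generating a whole video from text in one shot, which is why Meta says it makes training more efficient.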
Realistic videos, quickly
Emu Video builds on the work of Meta’s previous Make-A-Video model to generate four-second 512×512 videos at 16 frames per second. Meta says Emu Video is preferred over Make-A-Video by 96% of people for quality and by 85% for faithfulness to the text prompt.
As mentioned, it can also animate provided images based on a text prompt.
Emu Video could clearly help businesses create videos quickly and easily – without purchasing expensive equipment or subscribing to editing tools. However, it is still in the early stages of development and shouldn’t be relied upon right now. Results are likely to be patchy and basic.
What’s more, it’s not trying to replace editors and human creative minds. It’s simply trying to add another string to marketers’ bows.
What is Emu Edit?
Emu Edit is another generative tool announced by Meta. It allows users to alter images based on text inputs and works in a similar way to tools created by Canva, Google or Adobe.
“Emu Edit is capable of free-form editing through instructions, encompassing tasks such as local and global editing, removing and adding a background, color and geometry transformations, detection and segmentation, and more.”
Meta says that Emu Edit has an advantage over other generative AI models on the market as it follows instructions to let users make precise edits – such as asking it to add text to objects or turn a mountainous background into a cityscape.
“Unlike many generative AI models today, Emu Edit precisely follows instructions, ensuring that pixels in the input image unrelated to the instructions remain untouched.”
Meta says Emu Edit has been trained to be as precise as possible thanks to its dataset of 10 million synthesised examples which include an input image, description of the task and targeted output image.
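Each training example, as Meta describes it, is a triple of input image, task instruction and target output image. A minimal sketch of that structure (the field names and file paths here are hypothetical, for illustration only):

```python
from dataclasses import dataclass

@dataclass
class EditExample:
    """One of the ~10 million synthesised training triples Meta describes:
    an input image, a description of the editing task, and the expected
    output image. Paths stand in for actual image data."""
    input_image: str   # e.g. a photo of a dog on a lawn (hypothetical path)
    instruction: str   # the free-form editing task
    target_image: str  # the expected result of applying the edit

example = EditExample(
    input_image="dog_on_lawn.png",
    instruction="turn the dog into a lion",
    target_image="lion_on_lawn.png",
)
print(example.instruction)  # turn the dog into a lion
```

Training on instruction-output pairs like this is what lets the model learn to change only the region the instruction refers to, leaving unrelated pixels untouched.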
Extremely accurate editing
If it works as it’s supposed to, the accuracy of the programme will be impressive. Let’s say you have an image of a dog sitting on a grass lawn. Meta says you can ask Emu Edit to change the dog into a lion, and the AI will comply without a user needing to select the dog first.
That level of accuracy would help brands make quick, precise edits to images without the help of external resources – speeding up content creation time and improving task efficiency.
As these tools will be available on Facebook and Instagram, they’ll let marketers create AI content without needing third-party apps. Again, this will be a productivity boost.
“While certainly no replacement for professional artists and animators, Emu Video, Emu Edit, and new technologies like them could help people express themselves in new ways—from an art director ideating on a new concept or a creator livening up their latest reel to a best friend sharing a unique birthday greeting. And we think that’s something worth celebrating.”