Posted by z3d in ArtInt

On Tuesday, Stability AI released Stable Video Diffusion, a new free AI research tool that can turn any still image into a short video—with mixed results. It's an open-weights preview of two AI models that use a technique called image-to-video, and it can run locally on a machine with an Nvidia GPU.

Last year, Stability AI made waves with the release of Stable Diffusion, an "open weights" image synthesis model that kick started a wave of open image synthesis and inspired a large community of hobbyists that have built off the technology with their own custom fine-tunings. Now Stability wants to do the same with AI video synthesis, although the tech is still in its infancy.

Right now, Stable Video Diffusion consists of two models: one that can produce image-to-video synthesis at 14 frames of length (called "SVD"), and another that generates 25 frames (called "SVD-XT"). They can operate at varying speeds from 3 to 30 frames per second, and they output short (typically 2-4 second-long) MP4 video clips at 576×1024 resolution.

In our local testing, a 14-frame generation took about 30 minutes to create on an Nvidia RTX 3060 graphics card, but users can experiment with running the models much faster on the cloud through services like Hugging Face and Replicate (some of which you may need to pay for). In our experiments, the generated animation typically keeps a portion of the scene static and adds panning and zooming effects or animates smoke or fire. People depicted in photos often do not move, although we did get one Getty image of Steve Wozniak to slightly come to life.

2

Comments

You must log in or register to comment.

There's nothing here…