Runway has shouldered aside Midjourney and Stable Diffusion, introducing the first clips of text-to-video AI art that the company says are entirely generated from a text prompt.
The company said it is offering a waitlist to join what it calls "Gen 2" of its text-to-video AI, after offering a similar waitlist for its first, simpler text-to-video tools, which use a real-world scene as a model.
When AI art emerged last year, it used a text-to-image model. A user would enter a text prompt describing the scene, and the tool would attempt to create an image using what it knew of real-world "seeds," artistic styles, and so on. Services like Midjourney perform these tasks on a cloud server, while Stable Diffusion and Stable Horde take advantage of similar AI models running on home PCs. A minimal sketch of what that local workflow looks like appears below.
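For the curious, here is a rough sketch of that local text-to-image workflow using Hugging Face's diffusers library, which can run Stable Diffusion on a home GPU. The model ID, prompt, and settings are illustrative assumptions, not anything Runway or this article specifies:

# A minimal sketch of local text-to-image generation, assuming the
# Hugging Face "diffusers" library and a CUDA-capable home GPU.
# The model checkpoint and prompt are illustrative, not from the article.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # Runway's public Stable Diffusion checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# A fixed seed makes the output reproducible -- the "seed" alluded to above.
generator = torch.Generator(device="cuda").manual_seed(42)

image = pipe(
    "a golden retriever running on a beach, photorealistic",
    num_inference_steps=30,
    guidance_scale=7.5,
    generator=generator,
).images[0]
image.save("retriever.png")

Everything here runs on the user's own hardware, which is the key difference from cloud services like Midjourney.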
Text-to-video, however, is the next step. There are various ways of accomplishing this: Pollinations.ai has accumulated a few models that you can try out, one of which simply takes a few related scenes and constructs an animation stringing them together. Another simply creates a 3D model of an image and allows you to zoom around it.
Runway takes a different approach. The company already offers AI-powered video tools: inpainting to remove objects from a video (as opposed to an image), AI-powered bokeh, transcripts and subtitles, and more. The first generation of its text-to-video tools let you construct a real-world scene, then use it as a model to overlay a text-generated video on top of it. This is often done with a still image, where you could take a photo of a Golden Retriever and use AI to transform it into a photo of a Doberman, for example.
That was Gen 1. Runway's Gen 2, as the company tweeted, can use existing images or videos as a base. But the technology can also fully auto-generate a short video clip from a text prompt and nothing more.
As Runway's tweet indicates, the clips are short (just a few seconds at most), awfully grainy, and suffer from a low frame rate. It's not clear when Runway will release the model for early access or general access, either. But the examples on the Runway Gen 2 page do show a variety of video prompts: pure text-to-video AI, text-plus-image to video, and so on. It appears that the more input you give the model, the better your luck. Applying a video "overlay" over an existing object or scene seemed to produce the smoothest video and the highest resolution.
Runway already offers a $12/mo "Standard" plan that allows for unlimited video projects. But certain tools, such as actually training your own portrait or animal generator, require an additional $10 fee. It's unclear what Runway will charge for its new model.
What Runway does prove, however, is that in a few short months we've moved from text-to-image AI art to text-to-video AI art... and all we can do is shake our heads in amazement.