Introducing Our V1 Video Model
Hi everyone!
As you know, our focus for the past few years has been images. What you might not know is that we believe the inevitable destination of this technology is models capable of real-time open-world simulations.
What’s that? Basically, imagine an AI system that generates imagery in real-time. You can command it to move around in 3D space, the environments and characters also move, and you can interact with everything.
In order to do this, we need building blocks. We need visuals (our first image models). We need to make those images move (video models). We need to be able to move ourselves through space (3D models), and we need to be able to do all of this fast (real-time models).
The next year involves building these pieces individually, releasing them, and then slowly putting them all together into a single unified system. It might be expensive at first, but sooner than you’d think, it’s something everyone will be able to use.
So what about today? Today, we’re taking the next step forward. We’re releasing Version 1 of our Video Model to the entire community.
From a technical standpoint, this model is a stepping stone, but for now we had to figure out what, concretely, to give you.
Our goal is to give you something fun, easy, beautiful, and affordable so that everyone can explore. We think we’ve struck a solid balance, though many of you will feel a need to upgrade at least one tier for more fast-minutes.
Today’s Video workflow will be called “Image-to-Video”. This means that you still make images in Midjourney, as normal, but now you can press “Animate” to make them move.
There’s an “automatic” animation setting which makes up a “motion prompt” for you and “just makes things move”. It’s very fun. Then there’s a “manual” animation button which lets you describe to the system how you want things to move and the scene to develop.
There are “high motion” and “low motion” settings.
Low motion is better for ambient scenes where the camera stays mostly still and the subject moves in a slow or deliberate fashion. The downside is that sometimes you’ll actually get something that doesn’t move at all!
High motion is best for scenes where you want everything to move, both the subject and camera. The downside is all this motion can sometimes lead to wonky mistakes.
Pick what seems appropriate or try them both.
Once you have a video you like, you can “extend” it - roughly 4 seconds at a time - up to four times in total.
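As a quick back-of-the-envelope check on how long a clip can get, here is a minimal sketch that adds up the extensions. It assumes each video starts at the 5-second length quoted in the launch pricing further down, and it treats "roughly 4 seconds" as exactly 4. The names are purely illustrative and not part of any Midjourney tool.

```python
# Rough arithmetic only; every name here is hypothetical.
BASE_SECONDS = 5      # assumed starting clip length (from the launch pricing note)
EXTEND_SECONDS = 4    # "roughly 4 seconds" per extension, treated as exact
MAX_EXTENSIONS = 4    # "four times total"


def clip_length(extensions: int) -> int:
    """Approximate clip length in seconds after the given number of extensions."""
    if not 0 <= extensions <= MAX_EXTENSIONS:
        raise ValueError(f"extensions must be between 0 and {MAX_EXTENSIONS}")
    return BASE_SECONDS + EXTEND_SECONDS * extensions


# Length after 0, 1, 2, 3 and 4 extensions.
lengths = [clip_length(n) for n in range(MAX_EXTENSIONS + 1)]
print(lengths)  # [5, 9, 13, 17, 21]
```

Under those assumptions, a fully extended video tops out at roughly 21 seconds.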
We are also letting you animate images uploaded from outside of Midjourney. Drag an image to the prompt bar and mark it as a “start frame”, then type a motion prompt to describe how you want it to move.
We ask that you please use these technologies responsibly. Properly utilized, they’re not just fun; they can also be really useful, or even profound - making old and new worlds suddenly alive.
The actual costs to produce these models and the prices we charge for them are challenging to predict. We’re going to do our best to give you access right now, and then over the next month, as we watch everyone use the technology (or possibly entirely run out of servers), we’ll adjust everything to ensure that we’re operating a sustainable business.
For launch, we’re starting off web-only. We’ll be charging about 8x more for a video job than an image job, and each job will produce four 5-second videos. Surprisingly, this means a video costs about the same as an upscale, or about “one image worth of cost” per second of video. This is amazing, and over 25 times cheaper than what the market has shipped before. It will only improve over time. Also, we’ll be testing a video relax mode for “Pro” subscribers and higher.
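For anyone who wants to sanity-check the pricing math, here is a rough sketch measured in "image job" units. The 8x multiplier, four videos per job, and 5 seconds per video come from the paragraph above. The number of images an image job produces is not stated in this post, so the value of 4 used below is an assumption, and all names are hypothetical.

```python
# Back-of-the-envelope pricing, in units of "one image job". Names are hypothetical.
VIDEO_JOB_MULTIPLIER = 8   # a video job costs "about 8x" an image job
VIDEOS_PER_JOB = 4         # each video job produces four videos
SECONDS_PER_VIDEO = 5      # each video is 5 seconds long
IMAGES_PER_IMAGE_JOB = 4   # ASSUMPTION: not stated in this post


def video_cost_breakdown(images_per_image_job: int = IMAGES_PER_IMAGE_JOB) -> dict:
    """Return approximate costs per video and per second of video."""
    image_jobs_per_video = VIDEO_JOB_MULTIPLIER / VIDEOS_PER_JOB
    image_jobs_per_second = image_jobs_per_video / SECONDS_PER_VIDEO
    # Convert image jobs into single images using the assumed images-per-job value.
    images_per_second = image_jobs_per_second * images_per_image_job
    return {
        "image_jobs_per_video": image_jobs_per_video,
        "image_jobs_per_second": image_jobs_per_second,
        "images_per_second": images_per_second,
    }


breakdown = video_cost_breakdown()
for name, value in breakdown.items():
    print(f"{name}: {value:.2f}")
```

With these inputs, one video costs about two image jobs, and a second of video lands in the neighborhood of one to two images' worth of cost. The exact figure depends on the assumed images-per-job value, so treat it as a ballpark only.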
We hope you enjoy this release. There’s more coming and we feel we’ve learned a lot in the process of building video models. Many of these learnings will come back to our image models in the coming weeks or months as well.
Thank you for being part of this journey with us - and have fun!
David