AI images can be shockingly good. AI animation… sucks. That’ll change. There’s too much training data not to. Every minute of video is hundreds of adjacent frames to tell the machine what can happen between adjacent frames. But right now, it’s either fuzzy and bad, or clean and worse, and I cannot comprehend how anyone saw these and said “that’ll do.”
Just pick a good frame and wiggle the parts in Live2D or something.
The hilarious part is that Hoyo is constantly pushing the boundaries of what can be done with Live2D; it’s heavily used in Genshin character teasers, and their otome game uses it extensively. They’re really good at this. Why get AI involved?
Trying makes sense. Failing makes sense. Shipping anyway does not make sense.
It has the possibility of looking better than Live2D in the future. Start small, build to big.
https://nitter.net/Yokohara_h/status/1707393272862388546
https://nitter.net/TDS_95514874/status/1711332968252477814
https://nitter.net/TDS_95514874/status/1708974640817930328
https://nitter.net/TDS_95514874/status/1711215358600945824
I’m looking forward to superresolution in video.
Each existing frame of video, especially older video, contains a limited amount of information. You can maybe do some static image upscaling – and AI upscaling is actually pretty remarkable. I was blown away by what Stable Diffusion could do with some old comic book scans.
But more than that…there’s a whole video’s worth of footage of the characters and scenes. For most of the video, that information can, given the right software and a 3D model, be incorporated back into frames to generate a higher-resolution image.
To say nothing of frame interpolation to generate higher-frame-rate video.
Like, I like Lawrence of Arabia. That movie actually has pretty good-quality footage. But…there’s still film grain. And the frame rate is only so high. But there is a whole lot of footage of Lawrence in that movie, enough information to do a pretty good job, if used effectively, of dropping film grain, generating intermediate frames, and increasing the resolution.
This is possible today, and without much effort. Most Stable Diffusion kits just come with upscalers and, as long as you pick the right ones for the job, the models act like fucking magic. Way way better than any of the “nearest neighbor” algorithms image editors provide.
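For a sense of what the classic resamplers are doing, here is a minimal pure-Python sketch of nearest-neighbor and bilinear upscaling on a tiny grayscale image stored as a list of rows. The function names and the toy image are made up for illustration, and this is not how any particular image editor or SD upscaler is implemented. The point is only that both methods reuse or average pixels that already exist; neither can add detail the way a trained model tries to.

```python
def upscale_nearest(img, factor):
    """Nearest neighbor: each source pixel becomes a factor x factor block."""
    height, width = len(img), len(img[0])
    return [[img[y // factor][x // factor] for x in range(width * factor)]
            for y in range(height * factor)]


def upscale_bilinear(img, factor):
    """Bilinear: each output pixel is a weighted mix of its 4 nearest source pixels."""
    height, width = len(img), len(img[0])
    out = []
    for y in range(height * factor):
        # Map the output row back into source coordinates, clamped to the last row.
        sy = min(y / factor, height - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, height - 1)
        fy = sy - y0
        row = []
        for x in range(width * factor):
            sx = min(x / factor, width - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, width - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


if __name__ == "__main__":
    tiny = [[0, 100],
            [100, 200]]
    for row in upscale_nearest(tiny, 2):
        print(row)   # blocky: every value is copied from the source
    for row in upscale_bilinear(tiny, 2):
        print(row)   # smoother: new values are averages of neighbors
```

Nearest neighbor gives hard-edged blocks and bilinear gives a soft blur, which is roughly why the model-based upscalers look like magic by comparison.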
Video editors already have really good tools for interpolating frames for slow motion. They are a bit fiddly in high motion situations, but work well otherwise.
You can do upscaling with AI upscalers in SD today, yeah, and it’s pretty nifty, but it’s working with a 2D model. That’s nice if you have a lot of footage of Lawrence from exactly the same angle; if you train a model on the whole video, then you can use that for upscaling individual frames.
But my point is that if you have software that’s smart enough to make use of information derived with a 3D model, then you don’t need to have that identical angle to make use of the information there.
Let’s say that you’ve got a shot of Peter O’Toole like this:
https://prod-images.tcm.com/Master-Profile-Images/lawrenceofarabia1962.4455.jpg?w=824
And another like this:
https://media.vanityfair.com/photos/52d691da6088e6966a000006/master/w_2240,c_limit/1389793754760_lawrencethumb.jpg
Those aren’t from the same angle.
But add a 3D model to the thing, and you can use data from the close-up in the first image to scale up the second. The software can rotate the data in three dimensions, understand the relationships. If you can take time into account, you could even learn how his robe flaps in the wind or whatnot.
One would need something like this.
My point is that if all you are doing is cleaning up frames and trying to take footage from 24fps to 60fps, you have all of the data you need from the previous/next frames to blend those into in-between frames. A model trained on the movie would help, but there’s no need to get into anything as complex as 3D models of objects. Sub-second animation data is just fine.
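To make the 24fps to 60fps idea concrete, here is a rough sketch of the simplest possible version: work out where each output frame falls between two source frames and cross-fade them by that fraction. The names `blend_plan` and `interpolate` are hypothetical, frames are just flat lists of pixel values, and a plain cross-fade is a stand-in I picked for illustration. Real interpolators estimate motion instead, because a straight blend ghosts badly on anything that moves.

```python
from fractions import Fraction


def blend_plan(src_fps, dst_fps, n_src_frames):
    """For each output frame, return (prev_index, next_index, weight_of_next)."""
    step = Fraction(src_fps, dst_fps)  # source frames advanced per output frame
    # Number of output frames that fit between the first and last source frame.
    n_out = (n_src_frames - 1) * dst_fps // src_fps + 1
    plan = []
    for i in range(n_out):
        pos = i * step                          # exact position on the source timeline
        prev = pos.numerator // pos.denominator  # floor of pos
        nxt = min(prev + 1, n_src_frames - 1)
        plan.append((prev, nxt, pos - prev))
    return plan


def interpolate(frames, src_fps, dst_fps):
    """Cross-fade neighboring source frames to produce frames at dst_fps."""
    out = []
    for prev, nxt, weight in blend_plan(src_fps, dst_fps, len(frames)):
        w = float(weight)
        out.append([(1 - w) * a + w * b
                    for a, b in zip(frames[prev], frames[nxt])])
    return out


if __name__ == "__main__":
    # Three one-pixel "frames" brightening steadily at 24fps.
    source = [[0.0], [10.0], [20.0]]
    for frame in interpolate(source, 24, 60):
        print(frame)
```

At 24 to 60 each output frame advances 2/5 of a source frame, so most output frames land between two originals and only every fifth one lines up exactly with a real frame.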