mishu2
7 hours ago
Being able to do real-time video generation on a single workstation GPU is mind-blowing.
I'm currently hosting a video generation website, also on a single GPU (with a queue), which is something I didn't think possible even a few years ago (my Show HN from earlier today, coincidentally: https://news.ycombinator.com/item?id=46388819). Interesting times.
iberator
6 hours ago
Computer games have been doing it for decades already.
echelon
41 minutes ago
I think video-based world models like Genie 2 will happen and that they'll be shrunk down for consumer hardware (the only place they're practical).
They'll have player input controls, obviously, but they'll also be fed ControlNets for things like level layout, enemy placement, and game loop events. This will make them highly controllable and persistent.
When that happens, and when it gets good, it'll take over as the dominant type of game "engine".
arghwhat
5 hours ago
A very, very different mechanism that "just" displays the scene as the author explicitly and manually drew it, and yet has to pull an ungodly number of hacks to make that viable and fast enough, resulting in a far from realistic rendition...
This on the other hand happily pretends to match any kind of realism requested like a skilled painter would, with the tradeoff mainly being control and artistic errors.
echelon
39 minutes ago
> with the tradeoff mainly being control and artistic errors.
For now. We're not even a decade in with this tech, and look how far we've come in the last year alone with Veo 3, Sora 2, Kling 4x, and Kling O1. Not to mention the editing models like Qwen Edit and Nano Banana!
This is going to be serious tech soon.
I think vision is easier than "intelligence". In essence, we solved it in closed form sixty years ago.
We have many formulations of algorithms and pipelines. Not just for the real physics, but also tons of different hacks to account for hardware limitations.
We understand optics in a way we don't understand intelligence.
Furthermore, evolution keeps evolving vision over and over. It's fast and highly detailed. It must be correspondingly simple.
We're going to optimize the shit out of this. In a decade we'll probably have perfectly consistent Holodecks.
nkmnz
5 hours ago
Bob Ross did it, too.
pwython
3 hours ago
1 frame of Bob Ross = 1,800s