an0malous
an hour ago
This is why the AI companies are rushing to IPO. By the end of next year you’ll be running most of your AI on device. They have no moat, they’ve reached the limits of scaling, most of the magic can be distilled into smaller models, and they know it.
hadlock
25 minutes ago
Qwen's ~30B-class models are genuinely good enough for real use if you can find a machine with enough memory bandwidth to run them at 30-90 tokens/second. It's been extremely telling that Qwen stopped releasing 120B-class models. At some point in the next 10 years (maybe 3?) someone is going to release an Opus 4.5-class 256B model you can run locally. Right now our engineers use about $800/mo worth of Opus tokens; at that rate the ROI for a local LLM is ~10 months.
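A rough sketch of the arithmetic behind that ROI figure, in Python. The comment only gives the $800/mo spend and the ~10 month payback, so the ~$8,000 hardware budget below is simply what those two numbers imply, not a quoted price. The function names and the `monthly_running_cost` parameter (power, maintenance) are made up for illustration and default to zero.

```python
def implied_hardware_budget(monthly_api_spend: float, payback_months: float) -> float:
    """Hardware cost implied by a given API spend and payback period."""
    return monthly_api_spend * payback_months


def breakeven_months(hardware_cost: float,
                     monthly_api_spend: float,
                     monthly_running_cost: float = 0.0) -> float:
    """Months until local hardware pays for itself versus API tokens.

    monthly_running_cost is a hypothetical knob for power and upkeep;
    the original comment does not mention it.
    """
    monthly_savings = monthly_api_spend - monthly_running_cost
    if monthly_savings <= 0:
        raise ValueError("running cost meets or exceeds API spend; never breaks even")
    return hardware_cost / monthly_savings


if __name__ == "__main__":
    budget = implied_hardware_budget(800, 10)   # figures from the comment
    print(f"implied hardware budget: ${budget:,.0f}")
    print(f"break-even: {breakeven_months(budget, 800):.1f} months")
```

Any nonzero running cost pushes the break-even out past 10 months, and the estimate only holds if the local model can actually replace the Opus usage.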
sealeck
40 minutes ago
Have we reached the limits of scaling? Sadly, it appears that a larger model still equals a better model.
mindwok
4 minutes ago
I think GPT 4.5 showed that there is indeed a practical limit we're close to. That was supposedly a model with parameters in the high trillions that was deprecated almost immediately because it was slow, insanely expensive, and had questionable benefits over the smaller models. Though apparently the new Mythos and whatever GPT Spud is (if it wasn't 5.5) are back up in the high trillions.
mikestorrent
9 minutes ago
Well, let's not forget that text models are not the only models! Video models are much slower and need comparatively more resources, and all they can do even at that size is generate videos a few seconds long. Clearly a ton more work is going to go into those, and demand for them will probably increase as more creative tools get authored using them as a central part of the workflow. Low-res local rendering for preview might be a thing, but the lion's share of the work for high-res, near-realtime rendering is going to be done on huge clusters for a long time yet.
pixelready
29 minutes ago
I think there’s still an open question around whether the ultra-large next-gen models are worth it. For those of us without early access to Mythos, it’s hard to verify whether it’s been held back from the public due to actually being “too dangerously powerful to release yet” as implied, or because the gains aren’t outpacing the costs.
stogot
35 minutes ago
It’s still diminishing returns, yes? It isn’t Moore’s Law.
ActorNightly
4 minutes ago
Very false.
I use small models exclusively. They aren't a replacement for large models. You need decent hardware to run those models efficiently, as smaller-parameter models plain suck and are still slow on MacBooks. And affordability of higher-end hardware is very limited.
cat5e
25 minutes ago
Huzzah, they’ve lost their stranglehold. Viva la revolution!