jhy17632
5 hours ago
this is a big move. i think in general data centres are overblown: open source models that run on your laptop keep getting better and more popular. it's discussed here: https://johnny588358.substack.com/p/why-ai-infra-is-overblow... tl;dr: a lot of infrastructure spending might wind up being wasted funds. i'd imagine more cities/counties are going to be moving this way. whether that's really good or not, i don't know
jupr
5 hours ago
Is anyone selling pre-built machines specifically for consumers to have their own local models?
I think it would be a good business model.
I mean if you look back to how expensive computers were when they first arrived, accounting for inflation, it's basically the same price point.
I think it would be awesome if I could pay 5/6k for really nice hardware that my devices could query directly via an app.
The problem is that local is still not the easiest thing to set up, and tackling the networking aspect is not straightforward.
This sort of architecture is democratized by design and I am rooting for the person that figures this out for the masses.
rvz
5 hours ago
That's why Microsoft and all of the OEMs announced localized AI models that run on the new NVIDIA CPU on laptops.
jupr
5 hours ago
I don't think these small models are really that powerful yet, and I don't really like the direction of per-device localized models baked into the OS. Too wimpy and untrustworthy.
I want claude power, in a box, at my house, for my entire family completely compartmentalized from my operating system.
JumpCrisscross
4 hours ago
> I want claude power, in a box, at my house
This is still a six-figure commitment, possibly high five.
jupr
2 hours ago
How do you know this and what does it really cost? Those numbers make no sense to me.
JumpCrisscross
an hour ago
> How do you know this and what does it really cost
The cost of RAM and the size of the models. For Kimi K2.6 you need 2TB RAM. That’s $40k with DDR5. If you want it to run at the speeds you’re accustomed to with Claude, you need HBM, which costs more.
Practically speaking, you need to sink $250k+ into an 8x B200 node. So yeah, 6 figures to run properly. High 5 figures if you’re okay with really slow responses.
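For anyone who wants to redo the arithmetic, here is a rough sketch in Python. The only inputs taken from the numbers above are the 2TB and $40k figures. The helper names and the parameter count in the last example are hypothetical, picked for illustration, and are not a spec for any particular model. It also only counts memory for the weights, not KV cache or runtime overhead.

```python
def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    params_billions * 1e9 params * bytes_per_param / 1e9 bytes per GB
    simplifies to params_billions * bytes_per_param.
    """
    return params_billions * bytes_per_param


def implied_price_per_gb(total_cost_usd: float, capacity_gb: float) -> float:
    """Price per GB implied by a quoted total cost and capacity."""
    return total_cost_usd / capacity_gb


# Figures quoted in the comment: 2TB of DDR5 for about $40k.
ram_gb = 2 * 1000          # treating 1 TB as 1000 GB
ddr5_cost_usd = 40_000
price_per_gb = implied_price_per_gb(ddr5_cost_usd, ram_gb)

# ASSUMPTION: a hypothetical 1000B-parameter model, used only to show how
# precision changes the footprint. Not a claim about any real model.
for label, bytes_per_param in [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    size = weights_gb(1000, bytes_per_param)
    print(f"{label}: {size:.0f} GB weights, ~${size * price_per_gb:,.0f} at the implied DDR5 price")
```

The point of splitting it this way is that the bill scales linearly with both parameter count and bytes per parameter, so quantization is the main lever for getting under the quoted numbers, at some cost in quality.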