joshstrange
5 hours ago
> RAM prices are crashing because new models won’t need as much
Reality begs to differ [0], and following the link for that text leads to an article [1] about Google's TurboQuant, which supposedly will lower RAM requirements. Whether that means RAM prices come down (as speculated, not reported, in the link) or the AI companies just do more things with their extra RAM is yet to be determined. The fact this article links there with the text "RAM prices are crashing" throws the entire rest of the article into doubt for me.
RAM prices are most certainly not crashing (yet), and treating it as a foregone conclusion because _one_ lab found gains could be made, and hasn't even reported on the efficiency of their method, is just irresponsible. It's almost as bad as when LLMs link things to prove their point, you visit the link, and find it says nothing of the sort or even the opposite.
[0] https://pcpartpicker.com/trends/price/memory/
[1] https://tech.sportskeeda.com/gaming-news/how-google-s-new-tu...
amelius
5 hours ago
> Now if that means RAM prices come down (as speculated, not reported on, in the link) or the AI companies just do more things with their extra ram is yet to be determined.
I think it is determined:
woadwarrior01
5 hours ago
Yeah, even if one efficiency trick lands, people will end up spending the saved budget right back on bigger models, and/or more "thinking" tokens.
EthanHeilman
2 hours ago
Not if the bigger models have diminishing returns. Let's say you figure out a way to reduce RAM requirements 100x, but increasing RAM usage by 2x only gets you a 1% increase in effectiveness, and 3x doesn't get you any noticeable increase over 2x at all. Sure, you can reduce the price per token, but you might have already saturated the market. Even if you haven't saturated the market, your hardware-based moat just got smaller, and this is going to reduce your margins even more.
Just noticed that pydry made a similar point: https://news.ycombinator.com/item?id=47574216
pydry
5 hours ago
Jevons paradox only applies if demand hasn't already been saturated.
The fact that public LLM usage is leveling off at a price of $0 and Jensen "we make the shovels in this gold rush" Huang is rather desperately claiming that you need to spend $250k/year in tokens to be taken seriously suggests that demand saturation may not be that far off.
Whether Jevons' paradox applies to software engineers is, I think, another open question. I'm constantly being told that it doesn't and that LLMs make half of us redundant now, but I'm skeptical - so much of the automation I see is broken or badly done.
raincole
4 hours ago
It is quite hard to imagine how the demand is saturated now. I think any company that uses a sliver of AI will happily increase their token consumption 100x if it's free.
flir
3 hours ago
Are you assuming a brute force "burn tokens until it passes the tests" model, or is there a really sweet approach on the horizon that is impractical at current token costs?
I'm asking 'cos while I'm philosophically opposed to the first option, I'd love to hear about anything that resembles the second.
SpicyLemonZest
3 hours ago
One idea I've heard is prototype-first design reviews. If the cost of code genuinely trends to zero, there's no reason why most technical disagreements about product functionality couldn't come with prototypes to illustrate each side of the debate. Today, that's not always practical between token costs and usage limits.
pydry
4 hours ago
Executive FOMO disease is being exploited by the model providers to push for maximal token usage even when it is pointless.
This includes encouraging people to set up elaborate multi-model setups (e.g. "gas town") for coding that do not meaningfully improve productivity but which certainly do cause token usage to explode.
It also includes encouraging execs to use token consumption as a proxy for productivity - almost akin to SLOC.
AI has a halo right now and the managerial class seem to be willing to forgive almost any failure because the promise is so enticing. We're at peak expectations right now. They will soon start to be less forgiving when the warts which are intrinsic to LLMs remain unsolved.
monknomo
2 hours ago
nobody knows how to measure software productivity + AI is supposed to mean productivity goes up = more AI means more productivity
As best as I can tell, that's the thinking. It's one number, it's very easy to find and manage, and there is a belief that it directly measures productivity.
I disagree that it does; it seems to me the throughput of useful features is a better measure, but I'm not in the driver's seat on this one.
irke
an hour ago
Incremental revenue and cost savings, at least for enterprises, is where it would show up. There's also a present-value consideration - if LLMs make those dollars come into existence closer to the present, they are worth more.
The personal use case stuff is messy and subjective.
monknomo
an hour ago
attributing incremental revenue to gross engineering effort is challenging, imo.
Cost savings is primarily a function of headcount here, which is also easy to measure. So if we take my thesis that easy-to-measure stuff is prioritized...
irke
3 hours ago
Yep - it's impossible to separate experimental tokens from value-creating ones.
Ultimately the performance will be assessed via the income statement and cash flows of customers of the model producers.
Frankly, in the pre-IPO window it's in the best interests of OAI et al. to show a line going up and to the right in relation to tokens in their prospectus. What does that mean?
Strategic manipulation.
Marha01
3 hours ago
Demand for top models is definitely not saturated, at least when it comes to programming. If I could afford to use 5x more Claude Opus 4.6 tokens, I would!
hajile
2 hours ago
Demand is relative. How many Claude tokens would you buy if they had a 10x price hike?
The market has achieved its current saturation level with loss-leader prices that remind me of the Chinese bike-share bubble[0]. Once those prices go up to break-even levels (let alone profitable levels), the number of people who can afford to pay will go down dramatically (and that's not even accounting for the bubble pop further constricting people's finances).
pigpop
an hour ago
If they've already built themselves a loyal customer base (which is usually the point of fighting a price war) and the customers are happy with the technology they have, then, if funding is tight and turning a profit is more important, why wouldn't they pivot to optimizing inference: stopping further training, freezing the model versions, burning the weights into silicon, building better caching strategies, and improving harnesses and tools that lower their cost and increase their margin?
If all they do is hike prices then they'll lose customers to competitors who don't or who find a way to serve a similar model cheaper.
The demand isn't going to go away purely through higher prices. Once people know something is possible they will demand it whether supply is constrained or not. That's a huge bounty for anyone who can figure out how to service that demand.
vonneumannstan
2 hours ago
Pretty sure the entire markets for storage, HBM, DDR5, etc. are completely sold out for the next several years. How is that saturated?
adventured
4 hours ago
LLMs haven't remotely begun to be integrated into the lives of the typical person. Not even close. The typical person is using LLMs not at all as it pertains to their daily life tasks. They're using them almost entirely for limited discussion matters (e.g. having a discussion with GPT about a medical issue, or a work-related matter).
This is the first or second inning in the LLM rollout. It'll take 15-20 more years for full integration of AI agents into the life of the typical person.
The claw experiments for example can just barely be considered alpha stage. They're early AI garbage unfit for the average person to utilize safely. That new world hasn't gotten near the typical person yet.
The compute requirements to get to full integration of AI agents into the lives of average people - billions of them - are far beyond 10x where we're at now.
pizlonator
4 hours ago
> LLMs haven't remotely begun to be integrated into the lives of the typical person. Not even close. The typical person is using LLMs not at all as it pertains to their daily life tasks. They're using them almost entirely for limited discussion matters
This is an argument in favor of demand having leveled off.
pigpop
an hour ago
Only if nothing changes. Right now, people are running agent frameworks like OpenClaw on their own hardware or a VPS and the frameworks are often single person projects. This results in all sorts of problems but you can pick an easy solution from history which is to create a walled garden service for running these agents where you can provide security and standardization. If that platform also allows trusted services to integrate then they can provide end to end security guarantees. They also benefit from improvements to the models themselves making them more difficult to subvert. Creating something that is secure enough for the average person to entrust their credit card to is not an impossible task.
pydry
4 hours ago
>The typical person is using LLMs not at all as it pertains to their daily life tasks.
This doesn't track at all with my experience. Everybody is using it everywhere.
Moreover, people are using them for daily life tasks even when it is not an appropriate use of LLMs - e.g. getting medical advice, as you referred to, or writing emails which are clearly pissing off their coworkers.
In this respect I see it as akin to radium - a new technology that got a little too fashionable for its own good when it first emerged and which will likely have many use cases scaled back.
TheScaryOne
2 hours ago
>Everybody is using it everywhere.
No one in our Auto shop is using AI. One of the new diagnostic tools was demo'd with AI, and none of us were having it. It's about as accurate as Googling your symptoms.
My mother had an AI powered lung scan that came back with Stage 4 Cancer. The Oncologist got called in (for a fee!) to tell us it was just early stage COPD.
user34283
2 hours ago
In my experience people vastly overestimate the competence of doctors. Getting medical advice from LLMs could be life saving.
Personally, I experienced this when a specialized doctor believed a drug interaction to be the opposite of what it was, thinking A hinders the absorption of B, when actually it hinders the clearance, tripling the concentration of B.
Without AI, I would have been clueless about this and could not have spotted the mistake. I don't know if it would truly have been critical, but it did shake my confidence in doctors.
kmeisthax
4 hours ago
I thought we were going to hit token saturation years ago, but they keep inventing new ways to use tokens. Like, instead of asking a chat model to write something and getting ~1000 tokens out of it, you now have an agent producing ~10,000 tokens - or, worse, spawning 10 subagents that collectively burn ~100,000 tokens. All for marginally better answers with significantly higher compute usage.
Personally, I would have used all those tokens to generate synthetic data for IDA (iterated distillation and amplification) so that the more efficient 1000 token/answer chat model can answer more questions, but apparently that doesn't justify an insane datacenter buildout.
azinman2
3 hours ago
Everyone is interested in using fewer tokens to accomplish the same task.
user34283
2 hours ago
Marginally better answers?
Claude Code and co. can now analyze an enterprise codebase to debug issues in a system with multiple services involved.
I don't see how that would have been possible at all in the past.
Analemma_
3 hours ago
We’re not even close to demand saturation with tokens. Have you seen the people rending their garments with rage that Anthropic and Google won’t let them use their flat-rate subscriptions to burn millions of tokens per hour on OpenClaw? And that’s a tiny set of die-hard tinkerers.
The ceiling of token use when everyone has something akin to OpenClaw just running as a background process on their phone is way higher than there’s supply for right now. Jevons paradox is still in full force.
Macha
an hour ago
Is that not appealing to those users _because_ it's a subsidised flat rate? Like, those users could go and swap to API pricing right now if they wanted to, but at API pricing they don't want to.
fotcorn
5 hours ago
Also, there is zero reason to think that the big labs haven't had something similar to TurboQuant for a long time already.
The recent blog post from Google announcing TurboQuant does not change anything regarding RAM planning for the big labs.
TurboQuant itself is already a year old! So even smaller labs have probably seen and implemented it.
scw
4 hours ago
TurboQuant has a specific benefit: compressing the KV cache at negligible cost to quality. That mainly means context lengths can go up for the same amount of memory; however, the KV cache only accounts for something like 20% of the overall memory footprint, so this will not dramatically decrease memory demands in the way some of the more sensationalist reporting has stated.
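A rough back-of-envelope shows why the KV cache matters for long contexts but isn't the whole memory story. This is a sketch; the layer count, KV-head count, and head dimension below are illustrative assumptions, not any specific model's configuration:

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem):
    """Rough KV-cache size: two tensors (K and V) per layer,
    each of shape [batch, kv_heads, seq_len, head_dim]."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem

# Assumed 70B-class config: 80 layers, 8 KV heads (GQA),
# head_dim 128, 128k-token context, batch size 1.
fp16 = kv_cache_bytes(80, 8, 128, 128_000, 1, 2)    # 16-bit cache
int4 = kv_cache_bytes(80, 8, 128, 128_000, 1, 0.5)  # 4-bit cache

print(f"fp16 KV cache: {fp16 / 2**30:.1f} GiB")  # ~39.1 GiB
print(f"int4 KV cache: {int4 / 2**30:.1f} GiB")  # ~9.8 GiB
```

Quantizing the cache 4x either saves tens of GiB at a fixed context length or, equivalently, lets the context grow ~4x in the same memory - while the weights themselves (the other ~80% of the footprint) are untouched.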
lostmsu
3 hours ago
In large providers KV caches are the main bottleneck, no?
schmidtleonard
5 hours ago
The open source tooling got quantization support 3 years ago! It was a lesser type of quantization, but more than enough to prove that the savings just go to bigger models.
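A minimal sketch of that "lesser" kind of quantization - symmetric absmax rounding to 4-bit integer levels with one scale per tensor. This is illustrative only; real tooling quantizes per group or per channel and packs the integers, but the principle is the same:

```python
def quantize_absmax_int4(xs):
    """Crude symmetric 4-bit quantization: one absmax scale for the
    whole tensor, values rounded to integer levels in [-7, 7]."""
    scale = max(abs(v) for v in xs) / 7.0
    q = [max(-7, min(7, round(v / scale))) for v in xs]
    return q, scale

def dequantize(q, scale):
    """Map the integer levels back to approximate floats."""
    return [v * scale for v in q]

weights = [0.7, -1.4, 0.05, 2.1]          # toy "weights"
q, scale = quantize_absmax_int4(weights)  # each entry fits in 4 bits
restored = dequantize(q, scale)
# Round-trip error is bounded by scale/2 (here scale = 0.3):
print(q)         # [2, -5, 0, 7]
print(restored)  # [0.6, -1.5, 0.0, 2.1]
```

Even this crude scheme cuts weight storage ~4x versus fp16, which is the point: the memory savings were demonstrated years ago, and labs spent them on bigger models.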
adjejmxbdjdn
5 hours ago
I'm not disagreeing with you, but consumer RAM prices are a lagging indicator. If commercial RAM prices are dropping, consumers will see those price drops last, especially given that several manufacturers have turned to commercial-only production.
drakythe
5 hours ago
Is there a source that says commercial RAM prices are dropping? I was recently told (without a source, so I'm not sure if it's true) that OpenAI never even bought any of the RAM they signed deals on last year, and that those deals were just letters of intent. So if prices are coming down I wouldn't be shocked, but the economy is pretty well vibe coded these days, so who even knows.
ffsm8
3 hours ago
Well, all manufacturers of RAM have publicly stated that they're sold out for 2026.
RAM prices falling during 2026 is insanely unlikely unless AI crashes so hard it starts to actually kill companies - and not just any companies, but big tech.
I'm not seeing that in 2026. Maybe 2027 (I'd sincerely doubt that too, honestly), but definitely not within the next 9 months. Their runway is _way_ too large for things to spiral out of control within such a short period of time
dylan604
2 hours ago
If the claims the GP made about letters of intent to buy vs actual purchases are true, that brings additional questions. Like, if you send a letter of intent but do not follow through, are there financial penalties? How hard is it for the chip maker to sell the chips allotted based on that letter of intent? Would someone like Apple buy up the extra, or would they not need it as they've already bought enough for the units they expect to sell? If someone like Apple suddenly had an influx of RAM, that does not mean they would have extra CPU capacity to match. If the supply chain is this closely apportioned, what is the most likely result of a sudden surplus?
citrin_ru
3 hours ago
> unlikely unless AI crashes so hard it starts to actually kill companies. And not just any but big tech. I'm not seeing that in 2026
A month ago an AI crash was looking unlikely, but with the Strait of Hormuz being de facto blocked, many predict a global stagflation which could affect AI too.
sergiotapia
3 hours ago
here you go: https://x.com/wccftech/status/2037921057097892018
RAM prices are dropping
itintheory
2 hours ago
Every response to the original post calls it out as being factually incorrect...
ToucanLoucan
5 hours ago
If they see them. Plenty of businesses are still charging pandemic prices for all kinds of goods and simply pocketing the difference.
Cars come to mind instantly. Prices exploded in 2020/21 due to legitimate shortages, most of which have more or less been resolved, but the prices for new (and used!) cars never came back down.
mono442
an hour ago
Actually, the prices for new cars seem to be lower now than in 2022 where I live in Europe, though this could also be attributed to competition from Chinese manufacturers.
busterarm
3 hours ago
While the pandemic chip shortage resolved around 2024, a new chip shortage started in 2025 when the Dutch government took control of Nexperia (who are owned by China's Wingtech) and China retaliated by creating export restrictions. Honda, Nissan, Mercedes-Benz and others cut production. With less inventory, manufacturers and dealers are raising prices to compensate.
Also the cost of shipping never came down and lots of cars and/or their components need to cross oceans. Plus we have a new energy crisis...
slfnflctd
5 hours ago
> almost as bad as when LLMs link things to prove their point, you visit the link, and find it says nothing of the sort or even the opposite
To be fair, they got it from us. This happened to me plenty of times long before modern LLMs.
throwup238
4 hours ago
It learned by reading HackerNews, after all.
layer8
4 hours ago
I agree. The article they link to talks about memory company stocks crashing, not RAM prices crashing. There is some truth to the former: https://www.ft.com/content/e4e15692-187e-4466-832e-ec267e792...
h14h
5 hours ago
I do wonder how closely prices for consumer RAM kits follow the wholesale prices for DRAM chips that manufacturers see internally. The pcpartpicker graphs you linked show consumer prices have leveled out and may even be starting to fall. Depending on how the economics shake out, this could mean we've hit an inflection point.
My personal prediction is that once the VC bill comes due and prices for frontier models start to climb, competition for efficiency will heat up. The main AI use cases seem to be falling into buckets, and I doubt serving gigantic, do-it-all general models for every use case under the sun is remotely cost-effective.
If common use-cases start to be more efficiently served by smaller, more efficient purpose-built models (or systems thereof), it'd make the big frontier models increasingly niche. Cursor's Composer 2 model is a great example of this.
In any case, I think it's pretty fair to speculate we may be seeing RAM prices start falling sooner rather than later.
joshstrange
4 hours ago
Consumer vs. wholesale is an absolutely fair distinction to make; I'm not sure how to track wholesale prices. My main issue is the article saying "RAM prices are crashing" (which I can't find any evidence of) and linking to an article that doesn't even repeat that claim - it instead just speculates that maybe RAM will come down in price due to this new idea.
> In any case, I think it's pretty fair to speculate we may be seeing RAM prices start falling sooner rather than later.
I sure hope so. RAM, HDDs, and SSDs are all crazy-high right now and I was in the market for literally all 3 but have paused all my buying because I can't justify the costs as they stand today.
h14h
40 minutes ago
> My main issue the article saying "RAM prices are crashing" (which I can't find any evidence of) and linking to an article that doesn't even repeat that claim
That's totally fair. The article is written in a very odd way where it makes a bunch of authoritative, factual-sounding claims and then throws a "this is all very speculative" line right at the end.
It's very interesting speculation, but it can't really be considered anything more than that, despite its authoritative prose.
martinvol
4 hours ago
RAM prices haven't crashed yet, and it'll take time because changes have to propagate through the supply chain. Micron is already -20% from the top: https://www.investing.com/equities/micron-tech
Stock price is the best forward indicator I can think of
cwillu
4 hours ago
That might be true, but it's still straightforwardly wrong to say that RAM prices have crashed, and it calls into question everything else they write.
martinvol
4 hours ago
Yeah, good point, although it's just one of the catalysts I mentioned. In fact, I had written most of the post before I saw the news about RAM.
am17an
3 hours ago
Thank you, there are two things I would like to point out:
1) Google releasing something probably means they don't see it as important. 4-bit KV-cache quantization has been known for a long time. The fact that there is almost mass hysteria about this paper makes me think there is a lack of skepticism in this AI mania, even in a relatively tech-savvy crowd.
2) But prices for memory companies are crashing! look around, the whole market is crashing.
albinn
5 hours ago
I would think that we are going to see RAM prices increase even more, given, among other things, disruptions to the pure helium supply and increased electricity prices.
I haven't looked closely into TurboQuant, but perhaps it will revolutionize things just as much as the 1-bit LLM did...
aurareturn
4 hours ago
Even if TurboQuant, which was released a year ago, drastically lowers RAM requirements, AI labs will just release bigger models.
Jevons paradox. When are we going to learn that efficiency gains in AI do not decrease hardware usage?
functional_dev
4 hours ago
Valid point. It reminds me of video games: GPUs got faster, and devs pushed higher resolutions and more complex lighting instead of saving power :)
maeln
3 hours ago
> > RAM prices are crashing because new models won’t need as much
> Reality begs to differ [0] and following the link for that text goes to an article [1] where they talk about Google's TurboQuant which supposedly will lower the RAM requirements. Now if that means RAM prices come down (as speculated, not reported on, in the link) or the AI companies just do more things with their extra ram is yet to be determined. The fact this article links there with text "RAM prices are crashing" throws the entire rest of the article into doubt for me.
I find it fascinating how extremely reactive things have become. One research paper which, to my knowledge, hasn't been externally replicated, nor even implemented, generates tons of hyperbolic articles, tweets and such, and actually manages to move the market, at least temporarily. Not just this: a simple message in full caps lock by the president of the U.S., who is in the habit of constantly lying through his teeth, and the same thing happens. It's like there is a big bubble that threw any form of critical thinking out of the window and is in a hurry to react to anything, even if it is not remotely believable. Now, I understand why it happens - there is a lot of money to be made by capitalizing on FOMO, either by driving traffic to websites, socials, etc., or simply by insider trading (which feels like it has been legalized these days). But I still find the proportions it has taken incredible.
JCTheDenthog
3 hours ago
My favorite was when Google revealed Project Genie a month ago (which lets you generate video game worlds with AI, basically) and stocks for game companies immediately dropped. Anyone familiar with games knows that what Project Genie offers (essentially empty worlds with minimal interactivity that you can just kind of wander around in, and which struggle with simple things like object permanence if you look away) isn't real competition for actual games, but the markets reacted anyway.
incognition
3 hours ago
You nailed it. It's algos and noise trading
gmerc
4 hours ago
Consumer RAM is starved by production capacity shifting to HBM. HBM dropping in price would not affect consumer RAM on any immediate timeline. Also, as pointed out by many: Jevons paradox.
faangguyindia
5 hours ago
If the gains are real, why are the limits so bad? Google can barely serve Antigravity.
owlmirror
5 hours ago
Isn't that at the moment still a free product? Of course they will not prioritize serving those requests. That tells you nothing.
butlike
5 hours ago
It tells you there's no clear path to monetization.
adventured
4 hours ago
They've all avoided loading up their LLMs with ads to this point. That is going to change dramatically over the next 2-3 years. All of them will be loaded with ads, and Google will partake as expected given their ad network & capabilities in that realm. They'll match GPT's ad roll-out.
notatoad
3 hours ago
It has a paid option, and the Antigravity subreddit is full of people who claim to be paid users complaining about constantly hitting limits.
BoredPositron
5 hours ago
You get more Claude tokens from a Google subscription via Antigravity than from Anthropic - especially if you use the 5 other "family" accounts you can share the subscription with...
ajross
5 hours ago
> Reality begs to differ
Honestly you're both wrong. RAM prices spiked speculatively, and they're going down for the same reason. Market people always want to argue in fundamentals, when in practice *ALL* the high frequency components of the signal are down to a bunch of traders trying to guess where it's going in the short term.
At best those guesses are informed by ground truth ("AI needs a lot of RAM!", "Sam cornered the market!", "TurboQuant needs less RAM!"), but they remain guesses, and even then you can't tell the difference between that and random motion.
cma
5 hours ago
> RAM prices spiked speculatively
Didn't OpenAI buy up 40% of the capacity all at once?
ajross
4 hours ago
No, they signed a bunch of contracts for future deliveries. That's not a supply constraint. The factories making RAM continued operating and serving their existing deliveries, and in fact they still are.
Freshman economics would say that supply is fine and that prices shouldn't move. But they did anyway. And the reason is speculation.
leoc
4 hours ago
I don't get it, tbh. Which market participants were speculating here? There aren't futures markets in RAM as far as I know, though I certainly don't know much. And the supply constraints appear to have been pretty real (though maybe not immediate) if, e.g., Valve was begging publicly for RAM consignments. Were there pure-play speculators filling warehouses with DDR5?
notatoad
3 hours ago
>There aren't futures markets in RAM as far I know
Sure there is - not formally, but if you hold a contract for x units of future production, you can sell that contract to somebody else who wants those units more than you do.
irke
2 hours ago
That's a forward contract, yeah. They definitely do exist.
Futures are standardised forward contracts traded on exchanges
drakythe
4 hours ago
The economy is vibe coded at this point.
Have we gotten any more word on the potential helium constraints that SK Hynix was making noise about after the strike on the helium plant in the Middle East that supplied 60% of South Korea's helium? Because that could definitely put a kink in things, since SK Hynix is one of the 3 remaining big DRAM producers.
Forgeties79
5 hours ago
I'll believe they're going down when it doesn't cost $550 for the $105 RAM I purchased a year ago. Yes, consumer prices lag commercial prices, yada yada; I think any hot takes are pointless until we see lower prices or far more convincing evidence they're coming. When 32GB of DDR5 RAM costs basically as much as a MacBook, it's hard to hear "RAM is coming down for sure".
hirako2000
5 hours ago
Not crashing yet. The article is looking 1 to 5 years out.
Given Nvidia's CEO's agitation, I would give credit to the prediction, and if it's correct, the price will go back to what it was, or even lower if investments in capacity are made today.
michaelcampbell
5 hours ago
My take is that new capabilities will consume any price reductions, making them moot - at least in the medium term.
A RAM price drop due to some magic efficiencies assumes everything else doesn't change, which I doubt anyone honestly thinks will be the case.
mNovak
2 hours ago
This feels similar to when DeepSeek first debuted with claims of ultra-low-cost training, and all the pundits exclaimed that Nvidia was finished, the bubble had burst, etc.
sigmoid10
5 hours ago
Yeah, I also stopped reading at that point. If I want a bunch of random, made-up facts used to sell lukewarm opinions or steer the uneducated masses, I'll tune in to a Trump press conference. Why does this feel like someone desperately trying to make reality mirror his flailing market bets?
Forgeties79
4 hours ago
Sometimes it's real easy to see who has risky short positions right?
sandworm101
5 hours ago
There is also demand for RAM in other areas of data centers. As we are all pushed deeper into clouds, I can see the use of RAM for data storage (RAM drives) continuing to eat into the supply. A module of DDR5 will be more useful in a Netflix rack streaming movies 24/7 than in a gaming PC where it may only be used an hour or two a day.