I thought about it and here's my opinion:
There are two (technically three) ways AI compute gets paid for.
> 1. Renting GPU instances per minute. You mention Google Cloud, but some other providers can be cheaper too, since newer companies usually are. The "lowend hosting" of AI nowadays is usually via a marketplace-like service (Vast, RunPod, TensorDock).
Vast now offers serverless per-minute AI models, and checking something like https://vast.ai/model/deepseek-v3.2-exp (or even GLM 4.6), basically every one of these comes out to about $0.30/minute, i.e. $18/hour.
As a comparison, GLM 4.6 (now 4.7) has a YEARLY subscription price of around 30 bucks IIRC, so compare the immense difference in pricing.
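To put that gap in numbers, here is a rough sketch. The only inputs are the figures quoted above ($0.30/minute serverless, ~$30/year subscription, neither verified by me); the rest is arithmetic:

```python
# Back-of-envelope comparison of the two pricing models quoted above.
# Assumptions: $0.30/minute for a serverless Vast endpoint, ~$30/year
# for a GLM subscription (both as quoted in the text, not verified).

serverless_per_minute = 0.30
serverless_per_hour = serverless_per_minute * 60      # = $18/hour, matching the figure above
subscription_per_year = 30.0

# Minutes of serverless use that cost as much as a full YEAR of the subscription:
breakeven_minutes = subscription_per_year / serverless_per_minute

print(f"Serverless: ${serverless_per_hour:.2f}/hour")
print(f"One year of the subscription costs the same as about "
      f"{breakeven_minutes:.0f} minutes of serverless use")
```

In other words, under these assumptions you burn through a year's worth of subscription money in under two hours of per-minute rental.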
2. Using something like OpenRouter-style per-token pricing, in which case we are basically on the same pricing model as Google Cloud.
Of course, open models are reaching the frontier and I am cheering for them, but I feel like long term (and even short term) these options are still pretty expensive, even something like OpenRouter IMO.
Someone please do the genuine maths on this. I can be wrong (I usually am), but I expect a 2-3x price increase (on the conservative side) if things aren't subsidized.
These are probably tens of billions of dollars' worth of GPUs, so I assume they would be barely profitable at current rates, but they generate (in some cases) hundreds of billions worth of tokens, so they can probably make it work via the third use case I mention below.
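As a rough stab at the "genuine maths" requested above: every number here is a hypothetical round figure for illustration only; the sole input taken from this comment is the "tens of billions of dollars of GPUs" order of magnitude, and the 3-8 year depreciation range comes from the Reddit comment quoted further down.

```python
# Hypothetical back-of-envelope: yearly revenue needed just to cover GPU
# depreciation. All inputs are made-up round numbers, not real financials.

gpu_capex = 20e9          # assume $20B of GPUs (the "tens of billions" above)
depreciation_years = 4    # straight-line, inside the 3-8 year range quoted below
yearly_depreciation = gpu_capex / depreciation_years  # before power, staff, etc.

# Hypothetical blended price of $1 per million output tokens:
price_per_million_tokens = 1.0
tokens_per_year_to_break_even = yearly_depreciation / price_per_million_tokens * 1e6

print(f"Yearly depreciation alone: ${yearly_depreciation / 1e9:.0f}B")
print(f"Tokens/year to cover just that at $1/1M tokens: {tokens_per_year_to_break_even:.1e}")
```

The point of the sketch is mostly the sensitivity: halving the assumed depreciation window (8 years down to 4) doubles the token volume you need to sell before a single other cost is paid, which is why the accounting choices mentioned in the quote below matter so much.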
Now, coming to the third point, which I assume is related to the first two: the companies providing this GPU compute usually make their money via large long-term contracts.
Even Hugging Face provides consulting services, which I think is its biggest source of profit. Another big contender could be European GPU compute providers, who can offer a layer of safety/privacy for EU companies.
It looks like I had to go to Reddit to find more info (https://www.reddit.com/r/LocalLLaMA/comments/1msqr0y/basical...). Quoting the relevant parts of appenz's comment here:
> The large labs (OpenAI, Anthropic) and Hyperscalers (Google, Meta) currently are not trying to be profitable with AI as they are trying to capture market share. They may not even try to have positive gross margins, although the massive scale limits how much they can lose per inference operation.
> Pure inference hosters (Together, Fireworks etc.) have less capital and are probably close to zero gross margins.
> There are a few things that make all of this more complicated to account for. How do you depreciate GPUs (I have seen 3 years to 8 years), how do you allocate cost if you do inference during the day and train at night etc.
> The challenge with doing this yourself is that the market is extremely competitive. You need massive scale (as parallelism massively reduces cost), you need to be very good in negotiating cheap compute capacity and you need to be cost-effective in your G2M.
> Opinions are my own, and none of this is based on non-public information.
So basically, all of these are probably running at zero or net-negative margins, they require billions of dollars of spend, and there is virtually no moat/lock-in (nor does there have to be).
TLDR: no company right now is sustainable
The only profitable use case I can see is probably consulting, but even that may go the way described in https://www.investopedia.com/why-ai-companies-struggle-finan...
So I guess the only reasonable business, to me, is private AI for large businesses that genuinely need it (once again, the MIT study applies). But that usually wouldn't apply to us normal-grade consumers anyway; it would be really expensive, albeit private, and far out of reach for normal people.
TLDR: The only ones making money are (or are going to be) B2B, but even those will dwindle if the AI bubble bursts. Imagine a large business trying to explain why it should use AI when 1) the MIT study shows it's unprofitable and 2) there's fear around using AI, plus all the financial consequences the bubble's burst might cause.
So, all that being said, I doubt it. I think these prices only last as long as the bubble does, and the bubble is only as strong as its weakest link, which right now is OpenAI: trillions promised, a net loss-making company, a CEO who has said the AI market is in a bubble, and a CFO who openly floated the idea that OpenAI should be bailed out by the US government if need be.
So yeah... honestly, even local-grade GPUs are expensive, but with the innovation in open-weights models, I feel like they will be the way to go for 90% of basic use cases, and there are very few cases of moat (and I doubt the moat exists in the first place).