Xiaomi MiMo-v2.5 Series API Permanent Price Reduction Up to 99%

65 points posted 5 hours ago
by gainsurier

64 Comments

irthomasthomas

2 hours ago

Insane. 3 points behind opus on the artificialanalysis index.

Mimo cost ~$400 at the old price, so about $40 today. Opus cost ~$5000.

That's over 100x cheaper, and just 3 points behind.
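As a rough check of that arithmetic, here is a minimal Python sketch that uses only the approximate dollar figures quoted in this comment. The function and variable names are illustrative and not taken from any benchmark tool.

```python
def cost_ratio(expensive: float, cheap: float) -> float:
    """Return how many times more expensive the first figure is."""
    return expensive / cheap


# Approximate benchmark run costs as quoted in the comment (USD).
mimo_old = 400.0
mimo_new = 40.0
opus = 5000.0

# Opus relative to MiMo at the new price.
print(cost_ratio(opus, mimo_new))

# Fractional price reduction implied by going from ~$400 to ~$40.
print(1 - mimo_new / mimo_old)
```

With these figures the ratio comes out at 125x, consistent with "over 100x", and the quoted $400 to $40 drop corresponds to a 90% reduction on that workload.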

I can't wait to experiment with an llm consortium of 100 deepseek and mimo models. Crazy times.

Shut up and take my m̶o̶n̶e̶y̶ data!

Edit: Gemini on google search told me I could write strikethrough text on hn using <s>. Mimo told me it was unsupported and then went on to list some tags that are supported, like <b>bold</b>. I tried copy pasting the word in strikethrough from a word processor but it lost the format. I ended up using mimo in an agent shell wrapper to produce it, and copy pasting from the terminal worked for some reason.

noman-land

an hour ago

What did MiMo say?

irthomasthomas

an hour ago

Says it's not supported and lists a few tags that are, like <b>bold</b>.

Does this work: s̶t̶r̶i̶k̶e̶t̶h̶r̶o̶u̶g̶h̶
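One common way to produce this kind of plain-text strikethrough is to append the Unicode combining character U+0336 (COMBINING LONG STROKE OVERLAY) after each character. The sketch below assumes that is what the struck-through text in this thread contains; the helper name `strike` is hypothetical, and how the result renders depends on the font and the site.

```python
# U+0336 COMBINING LONG STROKE OVERLAY draws a stroke through the
# character that precedes it.
COMBINING_LONG_STROKE_OVERLAY = "\u0336"


def strike(text: str) -> str:
    """Append U+0336 after every character of text."""
    return "".join(ch + COMBINING_LONG_STROKE_OVERLAY for ch in text)


print(strike("strikethrough"))
```

Because the result is ordinary Unicode text rather than markup, it can survive copy and paste into fields that strip formatting tags.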

NitpickLawyer

3 hours ago

Since the 3rd party providers on openrouter have all converged on much higher prices in serving these models (both mimo and dsv4), there's obviously a question on how/why are they lowering the prices so much.

It's possible they've finally integrated cheap(er) Chinese chips. It's also possible they're just subsidising inference for real-world usage data. Interesting either way.

Aurornis

3 hours ago

> there's obviously a question on how/why are they lowering the prices so much.

Same reason they release some of the models for free: They are trying to capture market share.

lordofgibbons

3 hours ago

The difference is that releasing the model for free doesn't have an ongoing cost for the company. Providing cheap tokens is very expensive, especially if you don't have access to the latest transistor node chips. So I think the parent comment is right: there's something else at play allowing DS and Xiaomi to offer these nearly free tokens.

baq

3 hours ago

National security, training data

zrn900

3 hours ago

> how/why are they lowering the prices so much

Like I responded to someone else:

- Cheap electricity
- Cheap, domestically produced GPUs
- Efficiency research (a lot of it from Deepseek's research)

Also, the Chinese government wants the AI to be as accessible as EVs so everyone will use it.

wg0

3 hours ago

That's deliberate. US AI companies have no chance of recouping even a fraction of their valuations.

PS: Have not tried this, but DeepSeek V4 Flash (not even the DeepSeek V4 Pro version) set to "high" has pretty much Claude Opus 4.7-level capabilities and is lightning fast and dirt cheap. Hours and hours of conversation barely cost a few cents.

bel8

3 hours ago

DeepSeek Flash on high (not max) is a freak of nature indeed.

Very disproportionate intelligence-to-cost ratio.

I'm leveraging this temporary anomaly and using it as my coding workhorse.

fgonzag

an hour ago

The weights are open and, when prices settle down again, the model will be runnable on less than 10k of hardware.

I can easily run it in an 8-bit quant on the 4 x 48GB Radeon Pro W7900 GPUs I snagged for 2k each before the memory squeeze.

A 158B parameter model, especially in an architecture as efficient as DS4, is not that hard to drive currently if you got in before the craze, and will be relatively easy to drive with future hardware generations.
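A back-of-the-envelope sketch of why that fits, using only the numbers in this comment (158B parameters, 8-bit quantization, four 48GB cards). It assumes one byte per parameter and decimal gigabytes, and it ignores KV cache, activations, and runtime overhead, so the real headroom is smaller. The function name is illustrative.

```python
def weight_footprint_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate weight size in GB: billions of params times bytes per param."""
    return params_billion * bits_per_param / 8


weights_gb = weight_footprint_gb(158, 8)   # 8-bit quant: one byte per parameter
total_vram_gb = 4 * 48                     # four 48GB GPUs
headroom_gb = total_vram_gb - weights_gb   # left for KV cache and overhead

print(weights_gb, total_vram_gb, headroom_gb)
```

Under these assumptions the weights take about 158 GB of the 192 GB available, leaving roughly 34 GB for everything else.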

binary132

2 hours ago

What makes you so confident it’s temporary?

Arcuru

3 hours ago

I am very happy with DSv4 for their price/performance, but neither of them is comparable to Opus.

tartoran

3 hours ago

But they're overall a good thing for us consumers even if we'll never use these models, it forces the prices down all around.

surgical_fire

3 hours ago

I have been using DeepSeek API within Claude Code. So far it has been legitimately superior to Claude, and Codex that I used before.

passive

5 hours ago

I worked part time with MiMo 2.5-pro over the last month, and barely managed to use 500 Million of the 700 Million tokens I had allocated.

My plan was just upgraded to 38 BILLION tokens per month. That's at least 10X the tokens I've used in my entire agentic development so far.

I should probably downgrade my plan, but we'll see. :)

sisve

4 hours ago

Did you not get 38B units? And a token = 2.5 units (cache hit) or up to 600 units (cache miss).
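To see what that unit accounting would mean in tokens, here is a small sketch that takes the rates exactly as stated in this comment (2.5 units per cache-hit token, up to 600 units per cache-miss token). These are the commenter's figures, not verified pricing, and the function name is hypothetical.

```python
def tokens_from_units(units: float, units_per_token: float) -> float:
    """Convert a unit allowance into tokens at a given units-per-token rate."""
    return units / units_per_token


plan_units = 38e9  # the "38 BILLION" monthly allowance, read as units

best_case = tokens_from_units(plan_units, 2.5)    # every token is a cache hit
worst_case = tokens_from_units(plan_units, 600)   # every token is a full-price cache miss

print(f"best case:  {best_case:,.0f} tokens")
print(f"worst case: {worst_case:,.0f} tokens")
```

On those rates the allowance spans from about 15.2 billion tokens (all cache hits) down to roughly 63 million tokens (all cache misses at the top rate), so the headline number depends heavily on the cache hit ratio.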

passive

3 hours ago

Yeah, I think they did switch the unit type.

zrn900

3 hours ago

Yep. I also got stupefied after I logged in and saw how many tokens they stuffed into my account...

CachedaCodes

5 hours ago

These and the Deepseek ones that were cost-reduced recently are perfectly capable models for the vast majority of light work and more.

It's funny thinking the US companies are hiking prices and Chinese ones do the opposite. It's obviously a strategy, but pretty funny.

MaxPock

4 hours ago

How are these "capacity constrained" Chinese companies running inference without Hoppers and Blackwells ?

lukax

4 hours ago

Huawei Ascend AI Accelerators. DeepSeek V4 model architecture was optimized for Chinese hardware.

martinald

3 hours ago

They can (not entirely sure how 'grey' market this is) either have subsidiaries outside of China (e.g. Singapore) that provide the inference and/or just rent it off the public GPU clouds.

dijit

4 hours ago

Probably making their own NPUs for inference. You don't have to buy NVIDIA for inference. Google doesn't.

throawayonthe

3 hours ago

interesting/funny: their off-peak rates apply 00:00-08:00 Beijing time, so nine-to-five for someone on the NA west coast :p
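The conversion can be checked with a short sketch. It uses fixed UTC offsets rather than a timezone database: Beijing at UTC+8, and the North American west coast at UTC-7 (daylight time) or UTC-8 (standard time). The date is arbitrary and the helper name is illustrative.

```python
from datetime import datetime, timedelta, timezone

BEIJING = timezone(timedelta(hours=8), "CST")   # China Standard Time, UTC+8
PDT = timezone(timedelta(hours=-7), "PDT")      # Pacific Daylight Time
PST = timezone(timedelta(hours=-8), "PST")      # Pacific Standard Time


def to_zone(beijing_hour: int, target: timezone) -> datetime:
    """Convert an hour on an arbitrary Beijing date to the target zone."""
    local = datetime(2025, 7, 1, beijing_hour, 0, tzinfo=BEIJING)
    return local.astimezone(target)


for tz in (PDT, PST):
    start, end = to_zone(0, tz), to_zone(8, tz)
    print(tz.tzname(None), start.strftime("%H:%M"), "-", end.strftime("%H:%M"))
```

The 00:00-08:00 Beijing window maps to 09:00-17:00 during Pacific daylight time and to 08:00-16:00 during standard time, on the previous calendar day.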

dubcanada

3 hours ago

China has a population of 1.4B, US is 349M. 0-8 Beijing time is their off-peak? How is that funny, that's literally how timezones work?

bel8

3 hours ago

It's funny, in a good way, because their off-peak times perfectly match the western peak demand.

prodigycorp

3 hours ago

How realistic is this:

A Chinese model incidentally slurps up some terms that lead it to find unflattering words that you wrote about the CCP in a random journal entry, or maybe a social media CSV export. You go to China one day and are denied entry due to what you said.

Realistic or no? (yes i know the us is getting bad in re. to what you write online as well)

Models hosted in China are a siren call that I don't feel bad about resisting.

dubcanada

3 hours ago

This statement makes no sense, because you literally said the "US is getting bad". We already gave up all of our data, if you wrote something about the CCP you should already expect they know about it.

adrian_b

3 hours ago

This may be true of any model hosted by anyone other than you.

At least the Xiaomi models are open weights and you can host them yourself, avoiding such concerns.

csomar

3 hours ago

Well, at least for the Chinese models you can run them locally vs. the US models that require you to go through their servers. But to answer your question:

> How realistic is this:

Completely unrealistic unless you are a high value target (journalist, spy, business man, etc...)

artnanika

3 hours ago

You're projecting the US doing this with criticism of Trump and Israel on China, when there's no proof of China ever doing something like this.

hootz

4 hours ago

Will try MiMo now. I have been mainly using just DeepSeek lately because of the fact that V4-Flash destroys basic work for basically 0 cost. Haven't exceeded even 50% of my OpenCode Go weekly limits using V4 Flash and Pro.

Flockster

4 hours ago

The 99% is with regard to cached inputs. It now seems to be at the same price as deepseek v4-pro.

sim04ful

3 hours ago

"The api pricing for mimo-v2-pro and mimo-v2-omni remain unchanged". Could we presume this means the discount isn't from hardware improvement or availability?

nh43215rgb

3 hours ago

So exactly the same as deepseek 4 API pricing.

bel8

3 hours ago

One difference is that MiMo 2.5 (non-Pro) has image, audio and video input capabilities.

DeepSeek does not understand image, audio or video.

zrn900

3 hours ago

VSCode + Cline + Mimo v2.5 pro works! Great! Give it a try.

dzonga

an hour ago

as someone from the 3rd world - this is pleasant - even 3rd world countries will have affordable "A.I" access via Chinese models.

as someone who now lives & has lived in the west for the majority of their adult life - yeah the US western models r fucked n the crazy valuations of the A.I labs - which also filters down to the economy - since all money instead of being put to productive use is being wasted on this shit. hell electricity bills are up - cz datacenters need power. the current crooks in power don't believe in clean energy.

admiralrohan

3 hours ago

Everyone already said what I wanted to say. That all US companies (OpenAI, Anthropic, Google, MS Copilot) have increased price recently while Chinese companies (Deepseek, Xiaomi) are reducing price.

The question is how they are managing to do so? They are supposed to struggle due to chip sanctions.

Secondly, why now? The US companies were supposed to subsidize too, but now they are unable to keep up. Everyone is going to usage-based pricing, so it's unsustainable for them. They are well funded too.

If there is a genuine hardware breakthrough reducing compute needs, then that is good for the whole world, I believe.

tartoran

3 hours ago

Competition I guess, they must be burning some resources to make this price reduction happen...

lostmsu

2 hours ago

The state of the art models (mostly GPT 5.5, but also Gemini and Claude) are better so they cost more. Qwen 3.7 Max is their only direct competition and it is not any cheaper.

lostmsu

2 hours ago

The price cut is 50%.

rjhy2020

5 hours ago

OK. Google was just killed. How is it possible to reduce the price by 99%??????? This is crazy

zrn900

3 hours ago

- Cheap electricity
- Cheap, domestically produced GPUs
- Efficiency research by many PhDs (many AI companies used Deepseek's research though)

dubcanada

3 hours ago

Industrial electricity in China costs about the same as in Texas: 8-9 cents a kWh. The only advantage is that China has decided to put down millions of solar panels, so electricity costs can drop significantly during peak sunlight hours, since their rates are highly dynamic.

greenavocado

3 hours ago

Add to that home made inference chips and dirt cheap RAM from CXMT

dyauspitr

3 hours ago

State backed loss leaders.

x3ro

3 hours ago

Is that worse than VC-backed loss leaders? :)

drcongo

2 hours ago

I think this is probably correct based on the way state investment into the Chinese EV market has been working - fund a whole bunch of them and let them fight it out to be one of the few brands that will have the longevity. It's pretty brutal with the cars.

readthenotes1

4 hours ago

The rest of the best of the business is paying for it

m3kw9

4 hours ago

Everyone adding "Permanent" to price cuts now

hootz

4 hours ago

Can't be mistaken for someone like, ugh... Anthropic and OpenAI...

dyauspitr

3 hours ago

This sort of pressure will force them to though.

hootz

3 hours ago

I hope so, but I don't know if they are in a position where they can offer these kinds of prices. They are already struggling not to lose a lot of money on their models, while Chinese models can already be independently hosted by inference providers at a profit. We need to drive these prices down so AI doesn't become a thing for the few who can pay for expensive subscriptions.

rvz

5 hours ago

First Deepseek, now Xiaomi. A price cut of 99%.

This is why Anthropic wants these Chinese AI models banned, as they are in the lead in the AI race to zero and they know that there is no model moat.

So don't tell Dario.

han1

5 hours ago

Like I said. China doesn't care about money. We want AI in people's hands.

culi

4 hours ago

I mean the AI companies probably just want to make American model pricing look ridiculous in comparison (it's working imo). I think the government probably wants actually-useful AI that could be put into chips and actually revolutionize factory work or mining or whatever. Large, SOTA models are not gonna change factory work, but extremely efficient and optimized models may.

Every industry-wide scale technological revolution has happened because government funded a technology and then opened it up to the masses. Just look at your iPhone: GPS, the internet, AI voice assistants, touchscreens, microprocessors, lithium-ion batteries, etc all came from gov't research (I'm counting Bell Labs' gov't mandated monopoly + research funding as gov't)

Economist Mariana Mazzucato wrote a great book about this called The Entrepreneurial State: Debunking Public vs. Private Sector Myths

zrn900

3 hours ago

> I mean the AI companies probably just want to make American model pricing look ridiculous in comparison (it's working imo)

I really don't think China cares about that. Chinese government's governance logic is making everything so cheap that everyone can get and use it. They did it with EVs and other things. Now they are doing it with the AI.

_davide_

3 hours ago

They do want to see the American bubble burst, this is the quickest way