InitialPhase55
11 hours ago
Curious, how did you settle on Haiku/Sonnet? Because there are much cheaper models on OpenRouter that probably perform comparatively...
Consider Haiku 4.5: $1/M input tokens | $5/M output tokens vs MiniMax M2.7: $0.30/M input tokens | $1.20/M output tokens vs Kimi K2.5: $0.45/M input tokens | $2.20/M output tokens
I haven't tried so I can't say for sure, but from personal experience, I think M2.7 and K2.5 can match Haiku and probably exceed it on most tasks, for much cheaper.
lanyard-textile
2 hours ago
Since they're opening it publicly on irc here, the safety rails might be a consideration. I've made an agent recently and that's why I'm paying a premium to Anthropic atm -- Though I'm still experimenting to see if it's really necessary.
It's getting some organic usage -- 100M input tokens for just chats this month -- and I've seen enough users try to throw Haiku against the wall and failing to trick it into misbehaving. It "pumps the breaks" a lot and imitates annoyance when you ask it repeatedly :) Handles emotionally driven real-life questions mid-conversation well. It just works.
Not seeing all that consistently with other models I've tried so far -- but I've assumed it's not a completely fair comparison with (e.g.) open weights, since these safety rails are presumably not always arising from the natural model calls.
nl
6 hours ago
Xiaomi Mimo v2-Flash is fantastic.
I have a relatively hard personal agentic benchmark, and Mimo v2-Flash scores 8% higher in 109 seconds for $0.003 (0.3 cents!) vs Haiku which took 262 seconds for $0.24 (24 cents)
Gemini 3.1 Flash Lite Preview (yes that is its name) is also a solid choice.
ruguo
8 hours ago
MiniMax M2.7 is actually pretty solid. I’ve been using it for coding lately and it handles most tasks just fine, but Opus 4.6 is still on another level.
jeremyjh
8 hours ago
MiniMax's Token Plan is even less expensive and agent usage is explicitly allowed.
faangguyindia
8 hours ago
just use gemini flash3, it's better than haiku
0123456789ABCDE
2 hours ago
unless gp really cares about lower hallucination rates
https://artificialanalysis.ai/?omniscience=omniscience-hallu...
attentive
7 hours ago
or better yet 3.1 Flash-Lite at $0.25/1M input
ls612
8 hours ago
Because this is probably paid marketing by Anthropic?