pbd
4 days ago
GPT-4 at $24.7 per million tokens vs Mixtral at $0.24 - that's a 100x cost difference! Even if routing gets it wrong 20% of the time, the economics still work. But the real question is how you measure 'performance' - user satisfaction doesn't always correlate with technical metrics.
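As a rough sanity check on that claim, here is a back-of-the-envelope sketch. The prices are the ones quoted above; everything else is an assumption for illustration: the function name `blended_price` is hypothetical, and the model assumes every misrouted request is simply redone on GPT-4, which a real router may not do.

```python
GPT4_PRICE = 24.7     # USD per million tokens, as quoted in the comment
MIXTRAL_PRICE = 0.24  # USD per million tokens, as quoted in the comment


def blended_price(cheap_share, misroute_rate,
                  cheap=MIXTRAL_PRICE, strong=GPT4_PRICE):
    """Expected USD per million tokens under a simple retry model.

    cheap_share:   fraction of traffic the router sends to the cheap model
    misroute_rate: fraction of that cheap traffic that has to be redone
                   on the strong model (assumed retry policy)
    """
    cheap_cost = cheap_share * cheap                    # first attempt on cheap model
    retry_cost = cheap_share * misroute_rate * strong   # redo failures on strong model
    strong_cost = (1 - cheap_share) * strong            # traffic routed straight to strong
    return cheap_cost + retry_cost + strong_cost


# Route everything to the cheap model first, with 20% needing a retry.
price = blended_price(cheap_share=1.0, misroute_rate=0.2)
savings_factor = GPT4_PRICE / price
```

Under these assumptions the blended price lands well below the GPT-4-only price even with a 20% miss rate, though the result says nothing about whether the "successful" cheap answers are actually good enough.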
FINDarkside
4 days ago
It's trivial to get a better score than GPT-4 at 1% of the cost by using my proprietary routing algorithm that routes all requests to Gemini 2.5 Flash. It's called GASP (Gemini Always, Save Pennies)
nutjob2
4 days ago
Does anyone working in an individual capacity actually end up paying for Gemini (Flash or Pro)? Or does Google boil you like a frog and you end up subscribing?
baq
4 days ago
If I actually had time to work on my hobby projects Gemini Pro would be the first thing I’d spend money on. As is, it’s amazing how much progress you can squeeze out of those 5 chats every 24h; I can get a couple hours of before-times hacking done in 15 minutes, which is incidentally when free usage gets throttled and my free time runs out.
aspect8445
4 days ago
I've used Gemini in a lot of personal projects. At this point I've probably made tens of thousands of requests, sometimes exceeding 1k per week. So far, I haven't had to pay a dime!
worm00111
4 days ago
How come you don't need to pay? Do you get it for free somehow?
KETHERCORTEX
4 days ago
There's a free tier for the API.
drittich
4 days ago
"When you use Unpaid Services, including, for example, Google AI Studio and the unpaid quota on Gemini API, Google uses the content you submit to the Services and any generated responses to provide, improve, and develop Google products and services and machine learning technologies, including Google's enterprise features, products, and services, consistent with our Privacy Policy.
To help with quality and improve our products, human reviewers may read, annotate, and process your API input and output. Google takes steps to protect your privacy as part of this process. This includes disconnecting this data from your Google Account, API key, and Cloud project before reviewers see or annotate it. Do not submit sensitive, confidential, or personal information to the Unpaid Services."
Reference: https://ai.google.dev/gemini-api/terms
ivape
4 days ago
You get 1500 prompts on AI Studio across a few Gemini Flash models. I think I saw 250 or 500 for 2.5. It’s basically free and beats the consumer rate limits of big apps (Claude, ChatGPT, Gemini, Meta). I wonder when they’ll cut this off.
dcre
4 days ago
I've paid a few dollars a month for my API usage for about 6 months.
simpaticoder
4 days ago
PPT (price-per-token) is insufficient to compute cost. You also need to know the average tokens-per-interaction (TPI). The two multiply to give you a cost estimate. A 0.01x PPT is wiped out by a 100x TPI.
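A minimal sketch of that arithmetic, with made-up numbers chosen only to show the cancellation. The function name `cost_per_interaction` and both model profiles are hypothetical, not measurements of any real model.

```python
import math


def cost_per_interaction(price_per_million_tokens, tokens_per_interaction):
    """Estimated USD per interaction: PPT times TPI, with PPT given per million tokens."""
    return price_per_million_tokens * tokens_per_interaction / 1_000_000


# Hypothetical baseline: 10 USD per million tokens, 1,000 tokens per interaction.
baseline = cost_per_interaction(10.0, 1_000)

# Hypothetical "cheap" model: 0.01x the price, but 100x the tokens per interaction.
cheap_but_verbose = cost_per_interaction(0.1, 100_000)

# The 100x discount is fully cancelled by the 100x token usage.
same_cost = math.isclose(baseline, cheap_but_verbose)
```

The point is only that the two factors have to be compared together; which TPI multiplier is realistic for a given pair of models is a separate, empirical question.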
monsieurbanana
4 days ago
Are you saying that some models will take 100x more tokens than others (models in the same ballpark) for the same task? Is the 100 a real measured metric or just a random number to illustrate a point?
simpaticoder
4 days ago
With thinking models, yes 100x is not just possible, but probable. You get charged for the intermediate thinking tokens, even if you don't see them (which is the case for Grok, for example). And even if you do see them, they won't necessarily add value.
monsieurbanana
3 days ago
> With thinking models, yes 100x is not just possible, but probable
So the answer is no then, because I don't put reasoning and non-reasoning models in the same ballpark when it comes to token usage. You can just turn off reasoning.
datadrivenangel
4 days ago
The GPT-5 models use ~10x more tokens depending on the reasoning settings.
Keyframe
4 days ago
number of complaints / million tokens?
mkoubaa
4 days ago
> How you measure 'performance'
I heard the best way is through valuations
pqtyw
4 days ago
> GPT-4 at $24.7 per million tokens
While technically true, why would you want to use it when OpenAI itself provides a bunch of models that are many times cheaper and better?
KTibow
4 days ago
RouterBench is from March 2024.