k08200
5 hours ago
A big part of the cost explosion is using the frontier model for tasks that don't need it. I tried gpt-4o and much cheaper models, and the cheaper ones were more accurate in my three, so I was paying for reasoning depth that I don't use. The other half is asking the model to do what deterministic rules should be doing. Calls you don't make are the cheapest. Start by profiling which calls really need the larger model.
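A rough sketch of what that profiling could look like, assuming you can tag each call with a task type. Everything here is hypothetical (the tier labels, the `extract_date` task, the `needs_frontier` set); it makes no real API calls and only counts where each call would be routed: to a deterministic rule, a cheap model, or the frontier model.

```python
import re
from collections import Counter

# Hypothetical tier labels; substitute the models you actually use.
CHEAP_MODEL = "cheap-model"
FRONTIER_MODEL = "frontier-model"

# Example deterministic rule: ISO dates can be pulled out with a regex.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def rule_extract_date(text):
    """Return the first ISO date in the text, or None if there is none."""
    match = DATE_RE.search(text)
    return match.group(0) if match else None


def route(task_type, text, needs_frontier):
    """Return (handler, result). result is only set when a rule answered."""
    # 1. Try deterministic rules first: no model call at all.
    if task_type == "extract_date":
        hit = rule_extract_date(text)
        if hit is not None:
            return ("rule", hit)
    # 2. Otherwise pick the cheapest tier that the task type is known to need.
    model = FRONTIER_MODEL if task_type in needs_frontier else CHEAP_MODEL
    return (model, None)


def profile(calls, needs_frontier):
    """Count how many calls would land on each handler."""
    return Counter(route(task, text, needs_frontier)[0] for task, text in calls)


if __name__ == "__main__":
    sample_calls = [
        ("extract_date", "Invoice due 2024-05-01"),
        ("classify", "refund request"),
        ("plan", "multi-step migration"),
    ]
    print(profile(sample_calls, needs_frontier={"plan"}))
```

The `needs_frontier` set is the part you would fill in from your own accuracy comparisons; anything not in it defaults to the cheap tier.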