Try a head-to-head comparison using every LLM trick available, including prompt engineering, RAG, reasoning, inference-time compute, multiple agents, tools, etc.
Then try the same thing using fine-tuning and see which one wins. In ML class we have datasets of dog breeds hand-labeled by experts like Andrej; in real life, users don’t have specific, clearly defined, high-quality labeled data like that.
I’d be interested to be proven wrong.
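For anyone who wants to actually run that comparison, a minimal sketch of the harness might look like the following. The two predictors are hypothetical toy stand-ins (simple keyword rules), not real pipelines, and the held-out examples are made up; the only point is that both systems get scored on the same labeled set, so the resulting numbers say nothing about either approach.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]          # (input text, expected label)
Predictor = Callable[[str], str]   # any system under test: text in, label out


def accuracy(predict: Predictor, examples: List[Example]) -> float:
    """Fraction of held-out examples the predictor labels correctly."""
    if not examples:
        raise ValueError("need at least one held-out example")
    correct = sum(1 for text, label in examples if predict(text) == label)
    return correct / len(examples)


def head_to_head(
    candidates: Dict[str, Predictor], examples: List[Example]
) -> Dict[str, float]:
    """Score every candidate on the same held-out set."""
    return {name: accuracy(predict, examples) for name, predict in candidates.items()}


# Hypothetical stand-ins. In practice these would wrap a prompted/RAG/agent
# pipeline and a fine-tuned model; here they are toy rules so the sketch runs.
def prompted_pipeline(text: str) -> str:
    return "dog" if "bark" in text else "cat"


def fine_tuned_model(text: str) -> str:
    return "dog" if "bark" in text or "fetch" in text else "cat"


# Made-up held-out data, purely for illustration.
held_out: List[Example] = [
    ("it barks at the mailman", "dog"),
    ("it plays fetch all day", "dog"),
    ("it purrs on the couch", "cat"),
    ("it ignores everyone", "cat"),
]

scores = head_to_head(
    {"prompting": prompted_pipeline, "fine_tuning": fine_tuned_model},
    held_out,
)
print(scores)
```

The hard part in practice is the `held_out` list, not the harness: without clearly defined, high-quality labels there is nothing trustworthy to score either side against.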
I think it is easy for strong ML teams to fall into this trap because they themselves can get fine-tuning to work well. Trying to scale it to a broader market is where it fell apart for us.
This is not to say that no one can do it. There were users who produced good models. The problem we had was consistently finding such users who were also willing to pay for infrastructure.
I’m glad we tried it, but I personally think it is beating a dead horse/llama to try it today.
I mean, at the point where you’re writing tools to assist it, we are no longer comparing the performance of two LLMs. You’re taking a solution that requires a small amount of expertise and replacing it with one that requires more expertise and costs more. The question is not “can fine-tuning alone do better than every other trick in the book plus a SOTA LLM plus infinite time and money?” The question is “is fine-tuning useful?”
Whether the comparison was fair didn’t seem to matter to users who just wanted to build solutions within a reasonable time and budget.
If your customers can't fine-tune, do it for them instead.
How can you hire enough people to scale that while making the economics work?
Why would they join you rather than founding their own company?
> How can you hire enough people to scale that while making the economics work?
Once you (as in you, the person) have the expertise, what exactly do you need all those people for? To fine-tune, you need to figure out the architecture, how to train, and how to run inference, put together the dataset, and then run the training (optionally setting up a pipeline so the customer can run the "add more data -> train" process themselves). Which part of this process requires hiring so many people?
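To make the "add more data -> train" loop concrete, here is a rough sketch of what such a customer-facing pipeline could look like. Everything in it is an assumption for illustration: `RetrainPipeline` and `majority_label_trainer` are hypothetical names, and the trainer is a toy that just predicts the most common label, standing in for a real fine-tuning job.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

Example = Tuple[str, str]                   # (input text, label)
Model = Callable[[str], str]                # trained model: text in, label out
TrainFn = Callable[[List[Example]], Model]  # a training job: dataset in, model out


class RetrainPipeline:
    """Accumulates labeled data and re-runs training on demand."""

    def __init__(self, train_fn: TrainFn) -> None:
        self._train_fn = train_fn
        self._dataset: List[Example] = []
        self.model: Optional[Model] = None

    def add_data(self, examples: List[Example]) -> None:
        """Append newly labeled examples to the stored dataset."""
        self._dataset.extend(examples)

    def train(self) -> Model:
        """Retrain from scratch on everything collected so far."""
        if not self._dataset:
            raise ValueError("no training data yet")
        self.model = self._train_fn(list(self._dataset))
        return self.model


def majority_label_trainer(dataset: List[Example]) -> Model:
    """Toy stand-in for a real fine-tuning job: predicts the most common label."""
    most_common = Counter(label for _, label in dataset).most_common(1)[0][0]
    return lambda text: most_common


pipeline = RetrainPipeline(majority_label_trainer)

pipeline.add_data([("first ticket", "spam")])
first_model = pipeline.train()

# The customer adds more data later and retrains without outside help.
pipeline.add_data([("second ticket", "ham"), ("third ticket", "ham")])
second_model = pipeline.train()
```

The expertise goes into whatever replaces `majority_label_trainer`; once that is settled, the loop the customer runs is small.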
> Why would they join you rather than founding their own company?
Same as always, in any industry, not everyone wants to lead and not everyone wants to follow.
I think you are saying to go after the very high end of the market.
That’s fair. One segment of this market is sometimes called sovereign compute.
Another common model that I have seen is to become the DeepMind for one very large and important customer.
I think this works.
> How can you hire enough people to scale that while making the economics work?
Pick the right customers.
> Why would they join you rather than founding their own company?
The network effects of having enough resources in one place, and having other teams deal with the training data, infrastructure, deployment, etc.