ad-tech
5 hours ago
The voting thing breaks because you're treating all models equally when they shouldn't be. We ran consensus logic like this on a smaller scale and quickly realized throwing 5 mediocre models at a problem just makes them argue in circles. One good model beats three bad ones always. The synthesis round will get expensive fast too - we started with 2 models doing 3 rounds and it was already costing 40x a single pass. For brainstorm mode, maybe weight models by past accuracy instead of pure voting? We do this with our team internally - the person who got it right last time gets listened to more next time, instead of everyone getting an equal voice. Could be interesting to test.
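Rough sketch of what weighted voting could look like, in case it helps. This is not our production code: the function name, the model names, and the weight values are all made up for illustration, and it assumes each model casts exactly one vote per round.

```python
from collections import defaultdict

def weighted_vote(votes, weights, default_weight=1.0):
    """Pick the answer with the highest total weight.

    votes:   model name -> the answer that model voted for
    weights: model name -> weight (e.g. past accuracy); hypothetical values
    """
    totals = defaultdict(float)
    for model, answer in votes.items():
        # Models without a recorded weight fall back to default_weight.
        totals[answer] += weights.get(model, default_weight)
    # Ties resolve to whichever answer was seen first.
    return max(totals, key=totals.get)

votes = {"model_a": "X", "model_b": "Y", "model_c": "Y"}
weights = {"model_a": 0.9, "model_b": 0.4, "model_c": 0.4}

winner = weighted_vote(votes, weights)   # "X": 0.9 beats 0.4 + 0.4
unweighted = weighted_vote(votes, {})    # "Y": plain 2-vs-1 majority
```

With these made-up numbers the one strong model outvotes the two weak ones, which is the behavior pure majority voting can't give you.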
ElFitz
2 hours ago
Why would the synthesis round get more expensive than the regular rounds?
> and quickly realized throwing 5 mediocre models at a problem just makes them argue in circles.
What was your selection strategy? My current issue is more that the more models I add, the less likely any specific one is to win two rounds in a row. Which would make perfect sense no matter the model quality, no? Unless there’s a huge gap.
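Back-of-the-envelope version of what I mean. The assumptions are mine: rounds are independent, and each round's winner is drawn in proportion to a hypothetical per-model "strength" number. Real rounds are probably not independent, so treat this as a toy model.

```python
def p_back_to_back(strengths, i):
    """Chance that model i wins two independent rounds in a row,
    assuming each round's winner is drawn in proportion to strength."""
    p_single = strengths[i] / sum(strengths)
    return p_single * p_single

equal_3 = p_back_to_back([1, 1, 1], 0)        # (1/3)^2 = 1/9
equal_5 = p_back_to_back([1, 1, 1, 1, 1], 0)  # (1/5)^2 = 1/25
gap_5 = p_back_to_back([6, 1, 1, 1, 1], 0)    # (6/10)^2 = 0.36
```

So with equal models the odds drop as 1/N² regardless of how good they all are, and only a large gap in strength keeps one model winning repeatedly.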
> For brainstorm mode maybe weight models by past accuracy instead of pure voting?
By adding an output history and a way to track the actual outcomes?
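Something like this is what I have in mind. It's only a sketch: `OutcomeLog` and its methods are names I made up, and it assumes there is some way to know after the fact whether a model's output was right. The smoothing is my own choice so that a model with no history starts at a neutral 0.5.

```python
from dataclasses import dataclass, field

@dataclass
class OutcomeLog:
    # model name -> (number of correct outputs, total outputs recorded)
    records: dict = field(default_factory=dict)

    def record(self, model, was_correct):
        """Store whether a model's output turned out to be right."""
        correct, total = self.records.get(model, (0, 0))
        self.records[model] = (correct + int(was_correct), total + 1)

    def weight(self, model):
        """Smoothed past accuracy: (correct + 1) / (total + 2)."""
        correct, total = self.records.get(model, (0, 0))
        return (correct + 1) / (total + 2)

log = OutcomeLog()
log.record("model_a", True)
log.record("model_a", True)
log.record("model_b", False)

w_a = log.weight("model_a")      # (2 + 1) / (2 + 2) = 0.75
w_b = log.weight("model_b")      # (0 + 1) / (1 + 2), about 0.33
w_new = log.weight("model_new")  # no history yet: 0.5
```

The weights could then feed whatever voting step replaces the pure majority vote.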