Try CUGA in Hugging Face, the #1 Generalist Agent in the AppWorld Leaderboard

1 point, posted 12 hours ago
by jlaredo

1 Comment

verdverm

10 hours ago

Hey, we're #1 on a benchmark that none of the major players use, and that third parties haven't even entered them in... looks like only 2 submissions since the paper a year ago.

No Gemini, no Claude, just old models.

Does IBM wonder why no one takes them seriously in the AI space? I would be embarrassed to have this "#1" so prominently on display.