StanAngeloff
6 hours ago
> https://artificialanalysis.ai/agents/coding-agents?coding-ag...
This is the full URL for the view that shows a composite average across DeepSWE, Terminal-Bench and SWE-Atlas-QnA. Each model is measured in its own harness.
What surprises me is that Claude Code + Fable 5 (max) is on par with Codex + GPT-5.5 (xhigh), yet Fable burnt through 1M extra tokens to get there.