Show HN: New eval from SWE-bench team evaluates LMs based on goals not tickets

3 points, posted 8 hours ago
by lieret

1 Comment

jryio

6 hours ago

Is competition + limited resources (e.g. Core War) = selection pressure (natural or otherwise)?

Can we integrate and bring back reinforcement learning in a framework like this?