Show HN: New eval from SWE-bench team evaluates LMs based on goals not tickets

5 points, posted 3 months ago
by lieret

1 Comment

jryio

3 months ago

Does competition + limited resources (e.g. Core War) = selection pressure (natural or otherwise)?

Can we bring reinforcement learning back and integrate it into a framework like this?