Hackernews
DeepSWE: Measuring frontier coding agents on original, long-horizon SWE tasks
2 points by WarmWash, posted 5 hours ago (deepswe.datacurve.ai)
No comments yet