Hacker News
ImpossibleBench: Measuring Reward Hacking in LLM Coding Agents
2 points by gmays 11 hours ago (lesswrong.com)
No comments yet