Hacker News
ImpossibleBench: Measuring LLMs' Propensity of Exploiting Test Cases
2 points by BalinKing, posted 9 hours ago (arxiv.org)