GPT-5.6 cheats so much its testers couldn't measure it

6 points, posted 6 hours ago
by shakeelhashim

4 Comments

smallerize

6 hours ago

Why are the outputs measured in hours? Shouldn't it be tokens, or even words, since tokenizers can be more or less efficient?

dane_works

5 hours ago

Sam Altman promised us AGI, but OpenAI accidentally built something more human: an AI that cheats on exams just to look smarter than Claude.