Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult

6 points posted 10 hours ago
by jonesn11

2 Comments

musha68k

9 hours ago

Prompt injections + context window engineering are the combined Achilles' heel of the "agentic revolution".

user

10 hours ago

[deleted]