Claude Opus 4.5, and why evaluating new LLMs is increasingly difficult

6 points, posted 2 months ago
by jonesn11

2 Comments

musha68k

2 months ago

Prompt injections + context window engineering are the combined Achilles' heel of the "agentic revolution".

user

2 months ago

[deleted]