samuelknight
10 hours ago
My startup is building agents for automating pentesting. We started experimenting with Llama 3.1 last year. Pentesting with agents started getting good around Sonnet 3.5 v1.
The switch from Sonnet 4 to 4.5 was a huge step change. One of our beta testers ran our agent on a production Active Directory network with ~500 IPs and it was able to escalate privileges to Domain Admin (DA) within an hour. I've seen it one-shot scripts to exploit business logic vulnerabilities. It will slurp down JS from websites and sift through it for API endpoints, then run a Python server to perform client-side analysis. It understands all of the common pentesting tools with minor guardrails. When it needs an email to authenticate, it will use one of those 10-minute fake email websites with curl and Playwright. I am conservative about my predictions, but here is what we can learn from this incident and what I think is inevitably next:
Chinese attackers used Anthropic (a hostile and expensive platform) because American SOTA is still ahead of Chinese models. Open-weight models are about 6-9 months behind closed SOTA. So by mid-2026 hackers will have the capability to secretly host open-weight models on generic cloud hardware and relay agentic attacks through botnets to any point on the internet.
There is an arms race between the blackhats and private companies to build the best hacking agents, and we are running out of things the agent CAN'T do. The major change from Claude 4 to Claude 4.5 was the ability to avoid rate limiting and WAFs during web pentests, and we think the next step is AV evasion. When Claude 4.7 comes out, if it is able to effectively evade antivirus, companies are in for a rude awakening. Just my two cents.