stavros
6 hours ago
I don't understand why the takeaway here is, unless I'm missing something, more or less "everything is going to get exploited all the time". If LLMs can really find a ton of vulnerabilities in my software, why would I not run them and just patch all the vulnerabilities, leading to perfectly secure software (or, at the very least, software in which LLMs can no longer find any new vulnerabilities)?
Veserv
6 hours ago
When did we enter the twilight zone where bug trackers are consistently empty? The limiting factor in bug reduction is remediation, not discovery. Even developer smoke testing usually surfaces bugs far faster than they can be fixed, let alone actual QA.
To be fair, the limiting factor in remediation is usually finding a reproducible test case, which a vulnerability by necessity is. But I would still bet most systems have plenty of bugs in their trackers that come with a reproducible test case and are still bottlenecked on remediation resources.
This is, of course, orthogonal to the fact that patching insecure-by-design systems into security has so far been a colossal failure.
reactordev
5 hours ago
That might have been true pre-LLM, but you can literally point an agent at the queue until it's empty now.
batshit_beaver
5 hours ago
You literally cannot, since ANY changes to code tend to introduce unintended (or at least not explicitly requested) new behaviors.
lll-o-lll
5 hours ago
Eventual convergence? Assuming each defect fix has a 30% chance of introducing a new defect, we keep cycling until done?
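For what it's worth, under that assumption the process does converge in expectation: each fix spawns a follow-on defect with probability 0.3, so the expected number of fixes per original defect is the geometric series 1/(1-0.3) ≈ 1.43. A quick sketch (hypothetical numbers, assuming each bad fix introduces exactly one new defect):

```python
import random

random.seed(0)
P_NEW_DEFECT = 0.30  # assumed: each fix has a 30% chance of spawning one new defect

def fixes_until_empty(initial_defects):
    """Drain a defect queue where every fix may introduce a fresh defect."""
    queue, fixes = initial_defects, 0
    while queue:
        queue -= 1                        # fix one defect
        fixes += 1
        if random.random() < P_NEW_DEFECT:
            queue += 1                    # the fix introduced a new defect
    return fixes

# Geometric series: expected fixes per original defect = 1 / (1 - 0.3) ~= 1.43
trials = [fixes_until_empty(100) for _ in range(200)]
print(sum(trials) / len(trials) / 100)
```

The catch, as the replies note, is the assumption that every newly introduced defect actually gets caught and re-queued.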
saintfire
5 hours ago
Assuming you can catch every new bug it introduces.
Both assumptions being unlikely.
You also end up with a code base you let an AI agent trample until it is satisfied: ballooned in complexity and full of redundant, brittle code.
charcircuit
4 hours ago
You can have an AI agent refactor and improve code quality.
abakker
2 hours ago
But has any code been vetted and verified to show this approach works? This whole agentic code-quality claim is an assertion, but where is the literal proof?
Kinrany
5 hours ago
Why would it converge?
reactordev
5 hours ago
I’ve had mine on a Ralph loop no problem. Just review the PR.
k_roy
5 hours ago
Which still means a single person with Claude can clear a queue in a day versus a month with a traditional team.
bsder
4 hours ago
The fact that KiCad still has a ton of highly upvoted missing features, and that FreeCAD still hasn't solved the topological naming problem, are existence proofs to the contrary.
rybosworld
2 hours ago
Shouldn't be downvoted for saying this. There are active repos where this is happening.
"BuT ThE LlM iS pRoBaBlY iNtRoDuCiNg MoRe BuGs ThAn It FiXeS"
This is an absurd take.
layer8
6 hours ago
The pressure to do so will only arise as a consequence of the predicted vulnerability explosion, not before it. And it will have some cost, since you need dedicated and motivated people to conduct the vulnerability search, apply the fixes, and re-check until it comes up empty, before each new deployment.
The prediction is: Within the next few months, coding agents will drastically alter both the practice and the economics of exploit development. Frontier model improvement won’t be a slow burn, but rather a step function. Substantial amounts of high-impact vulnerability research (maybe even most of it) will happen simply by pointing an agent at a source tree and typing “find me zero days”.
cartoonworld
6 hours ago
I feel like the dream of static analysis was always a pipe dream.
When the payment for vulns drops, I'm wondering where the value is for hackers to run these tools anymore. The LLMs don't do the job for you; testing is still a LOT OF WORK.
tptacek
6 hours ago
That might be one outcome, especially for large, expertly-staffed vendors who are already on top of this stuff. My real interest is in what happens to the field for vulnerability researchers.
lifty
5 hours ago
Perhaps a meta-evolution: they become experts at writing harnesses and prompts for discovering and patching vulnerabilities in existing code and software. My main interest is whether, now that we have LLMs, the software industry will move to adopting techniques like formal verification, and other perhaps more relaxed approaches, that massively increase the quality of software.
habinero
2 hours ago
Testing exists.
> formal verification
Outside of limited specific circumstances, formal verification gives you nothing that tests don't give you, and it makes development slow and iteration a chore. People know about it, and it's not used for a lot of reasons.
stavros
5 hours ago
True, but I'm already curious to see what happens in a multitude of fields, so this is just one more entry on that list.
underdeserver
5 hours ago
Just wanted to point out that tptacek is the blog post's author (and a veteran security researcher).
cvwright
2 hours ago
Find-then-patch only works if you can fix the bugs quicker than you’re creating new ones.
Some orgs will be able to do this, some won’t.
stavros
an hour ago
"Find me vulnerabilities in this PR."
Buttons840
6 hours ago
> If LLMs can really find a ton of vulnerabilities in my software, why would I not run them and just patch all the vulnerabilities, leading to perfectly secure software?
Probably because it will be a felony to do so. Or, the threat of a felony at least.
And this is because it is very embarrassing for companies to have society openly discussing how bad their software security is.
We sacrifice national security for the convenience of companies.
We are not allowed to test the security of systems, because that is the responsibility of the companies that own them. Also, the companies that own the systems and are responsible for their security are not liable when those systems are found to be insecure and they leak half the nation's personal data, again.
Are you seeing how this works yet? Let's not have anything like verifiable, testable security interrupt the gravy train to the top. Nor can we expect systems to be secure all the time; be reasonable.
One might think that since we're all in this together and all our data is getting leaked twice a month, we could work together and all be on the lookout for security vulnerabilities and report them responsibly.
But no, the systems belong to companies, and they are solely responsible. But also (and very importantly) they are not responsible and especially they are not financially liable.
gruez
5 hours ago
>> If LLMs can really find a ton of vulnerabilities in my software, why would I not run them and just patch all the vulnerabilities, leading to perfectly secure software?
>Probably because it will be a felony to do so. Or, the threat of a felony at least.
"my software" implies you own it (i.e. your SaaS), so the CFAA isn't an issue. I don't think he's implying that vigilante hackers should be hacking Gmail just because they have a Gmail account.
zar1048576
6 hours ago
My sense is that the asymmetry is a non-trivial issue here. In particular, a threat actor needs one working path, while defenders need to close all of them. In practice, patching velocity is bounded by release cycles, QA and regression risk, and a potentially large number of codebases that need to be looked at.
woeirua
2 hours ago
Because not all software gets auto-updated. Most of it does not!
htrp
5 hours ago
Attackers only have to be successful once while defenders have to be successful all the time?
joatmon-snoo
5 hours ago
Breaking something is easier than fixing it.
tptacek
5 hours ago
People have said that for decades and it wasn't true until recently.
joatmon-snoo
2 hours ago
Hmm: can you elaborate?
I've never been on a security-specific team, but it's always seemed to me that triggering a bug is, for the median issue, easier than fixing it, and I mentally extend that to security issues. This holds especially true if the "bug" is a question about "what is the correct behavior?", where the "current behavior of the system" is some emergent / underspecified consequence of how different features have evolved over time.
I know this is your career, so I'm wondering what I'm missing here.
tptacek
2 hours ago
It has generally been the case that (1) finding and (2) reliably exploiting vulnerabilities is much more difficult than patching them. In fact, patching them is often so straightforward that you can kill whole bug subspecies just by sweeping the codebase for the same pattern once you see one instance. You'd do that just as a matter of course, without necessarily even qualifying that the bugs you're squashing are exploitable.
As bugs get more complicated, that asymmetry has become less pronounced, but the complexity of the bugs (and their patches) is offset by the increased difficulty of exploiting them, which has become an art all its own.
LLMs sharply tilt that difficulty back to the defender.
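That kind of same-pattern sweep can be sketched in a few lines (hypothetical pattern and file contents, purely for illustration):

```python
# Toy illustration of killing a "bug subspecies": after one strcpy
# overflow is found, sweep the whole tree for the same pattern.
# (The file paths and snippets below are made up.)
import re

sources = {
    "src/parse.c": "strcpy(dst, user_input);",
    "src/util.c":  "strncpy(dst, src, sizeof dst);",
    "src/net.c":   "strcpy(buf, packet->name);",
}

pattern = re.compile(r"\bstrcpy\(")
hits = [path for path, code in sources.items() if pattern.search(code)]
print(hits)  # the two unbounded strcpy call sites, not the bounded strncpy
```

Once one such overflow is confirmed, every other match becomes a candidate for the same fix, whether or not each individual site is proven exploitable.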
underdeserver
5 hours ago
Specifically in software vulnerability research, you mean.
Fixing vulnerable code is usually trivial.
In the physical world breaking things is usually easier.
charcircuit
5 hours ago
A proper fix, maybe. But LLMs can easily make it no longer exploitable in most cases.