snowmobile
9 hours ago
Wait, so you don't trust the AI to execute code (shell commands) on your own computer, and therefore need a safety guardrail, in order to facilitate it writing code that you'll execute on your customers' computers (the financial analysis tool)?
And given that you used AI to write the supposed containment system, I'm really not seeing the safety benefits here.
The docs also seem very AI-generated (see below). What part did you yourself play in actually putting this together? How can you be sure that filtering a few specific (listed) commands will actually give any sort of safety guarantees?
https://github.com/borenstein/yolo-cage/blob/main/docs/archi...
borenstein
9 hours ago
You are correct that the AI wrote 100% of the code (and 90% of the raw text). You are also correct that I want a safety guardrail for the process by which I build software that I believe to be safe and reliable. Let's take a look at each of these, because they're issues I also wrestled with throughout 2025.
What's my role here? Over the past year, it's become clear to me that there are really two distinct activities to the business of software development. The first is the articulation of a process by which an intent gets actualized into an automation. The second is the translation of that intent into instructions that a machine can follow. I'm pretty sure only the first one is actually engineering. The second is, in some sense, mechanical. It reminds me of the relationship between an architect and a draftsperson.
I have been much freer to think about engineering and objectives since handing off the coding to the machine. There was an Ars Technica article on this the other day that really nails the way I've been experiencing this: https://arstechnica.com/information-technology/2026/01/10-th...
Why do I trust the finished product if I don't trust the environment? This one feels a little more straightforward: it's for the same reason that construction workers wear hard hats in environments that will eventually be safe for children. The process of building things involves dangerous tools and exposed surfaces. I need the guardrails while I'm building, even though I'm confident in what I've built.
visarga
8 hours ago
> it's for the same reason that construction workers wear hard hats in environments that will eventually be safe for children.
Good response, but more practically: while you are developing a project you allow the agent to do many things on that VM, but when you deliver code it has to actually pass tests. The agent's work isn't tested live; the delivered code is tested before use. I think tests are the core of the new agent engineering skill - if you have good tests, you've automated your human-in-the-loop work to a large degree. You can only trust code up to the level of its testing. LGTM is just vibes.
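To make that concrete, here is a minimal sketch of such a delivery gate, assuming a Python project with a pytest suite (the script, paths, and messages are hypothetical, not from yolo-cage):

    #!/usr/bin/env python3
    """Hypothetical acceptance gate: only accept the agent's delivered code if
    the test suite passes. Names and paths are illustrative."""
    import subprocess
    import sys

    def tests_pass(workdir: str) -> bool:
        # Run the suite against the delivered code, not the agent's live sandbox.
        result = subprocess.run(["python", "-m", "pytest", "-q"], cwd=workdir)
        return result.returncode == 0

    if __name__ == "__main__":
        delivered = sys.argv[1] if len(sys.argv) > 1 else "./delivered-code"
        if tests_pass(delivered):
            print("Tests pass: the code is as trustworthy as the tests are thorough.")
            sys.exit(0)
        print("Tests fail: reject the delivery.")
        sys.exit(1)

The point is that the gate on delivered code, not live supervision of the agent, is what carries the trust.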
eikenberry
6 hours ago
> The first is the articulation of a process by which an intent gets actualized into an automation. The second is the translation of that intent into instructions that a machine can follow.
IMO this pattern fails on non-trivial problems because you don't know how the intent can be actualized into automation without doing a lot of the mechanical conversion first to figure out how the intent maps to the automation. That mapping is the engineering. If you could map intent to actualization without doing it, this would be a solved problem in engineering and usable by non-engineers. Relating this to your analogy, it is more like a developer vs. an architect, where the developer uses a pre-designed building while the architect needs to design a new building to meet a certain set of design requirements.
hmokiguess
9 hours ago
Thank you so much for this analogy. It reminded me how I've always biked without a helmet; even though I've been in crashes and hits before, it just isn't in my nature to worry about safety the way others do, I guess. People do be different, and it's all about your relationship with managing and tolerating risk.
(I am not saying one way is better than the other; they're just different modes of engaging with risk. I obviously understand that having a helmet can and would save my life should an accident occur. The keywords here are "should/would/can", whereas some people think in terms of "shall/will/does" and prefer to live that way. Call it different faith or belief systems, I guess.)
KurSix
5 hours ago
The logic is defense in depth. Even if the "cage" code is AI-written and imperfect, it still creates a barrier. The probability of the AI accidentally writing malicious code is high, but the probability of it also accidentally writing code that bypasses the imperfect protection it wrote itself is much lower.
snowmobile
4 hours ago
Defense in depth doesn't mean throwing a die twice and hoping you don't get snake eyes. The AI-generated docs claim that the AI-generated code only filters specific actions, so even if it manages to do that correctly, it's not a lot of protection.
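For concreteness, here is a minimal sketch of what filtering a few specific commands amounts to (a hypothetical denylist, not yolo-cage's actual code), and why a trivially rephrased command slips past it:

    # Hypothetical illustration of a naive command denylist, not the actual
    # yolo-cage implementation.
    BLOCKED_SUBSTRINGS = ["rm -rf /", "mkfs", "dd if="]

    def is_allowed(command: str) -> bool:
        # Reject a command only if it contains one of the listed substrings.
        return not any(bad in command for bad in BLOCKED_SUBSTRINGS)

    # The listed form is caught...
    assert not is_allowed("rm -rf / --no-preserve-root")
    # ...but semantically equivalent rewrites sail through: the list bounds
    # what the filter catches, not what the agent can do.
    assert is_allowed("rm -r -f /")
    assert is_allowed("find / -delete")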