simonw
10 days ago
I've talked to a team that's doing the dark factory pattern hinted at here. It was fascinating. The key characteristics:
- Nobody reviews AI-produced code, ever. They don't even look at it.
- The goal of the system is to prove that the system works. A huge amount of the coding agent work goes into testing and tooling and simulating related systems and running demos.
- The role of the humans is to design that system - to find new patterns that can help the agents work more effectively and demonstrate that the software they are building is robust and effective.
It was a tiny team and the stuff they had built in just a few months looked very convincing to me. Some of them had 20+ years of experience as software developers working on systems with high reliability requirements, so they were not approaching this from a naive perspective.
I'm hoping they come out of stealth soon because I can't really share more details than this.
urineeeee
10 days ago
Holy cow I actually bought this comment and it was on my mind for a bit, then saw another simonw comment about "the team" below. Check your sources folks!
Almost had me you cheeky devil you :)
spyckie2
9 days ago
What's the point honestly.
Given the pace of current ai, in 2 months dark factories will peak in hype, and then in another 6 months their cost/benefit tradeoffs will be fully identified, the wisdom of the crowds will have a relatively accurate understanding of their general usefulness, and the internet will move on to other things.
The next generation of ai coding will make dark factories legit due to their ability to architect decently. Then the generation after will make dark factories obsolete due to their ability to make it right the first time. That's about 8 months out for SOTA, and 14 months out for Sonnet/Flash/Pro users.
No need for them to come out of stealth, just imagine 1000s of junior/mid engineers crammed into an office given vague instructions to build an app and spit out code. Imagine a cctv in the room overlooking the hundreds of desks, and then press fast forward 100x speed.
That's literally what they built, because that's what's possible with Opus.
daxfohl
9 days ago
The funny thing is that the rest of the software industry is dying, except for the trillions of venture capital being invested into these AI coding whatevers. But given the slow death of software, once these AI coding whatevers are finished, there's going to be nothing of value left for them to code.
But I'm sure the investors will still come out just fine.
observationist
10 days ago
You'd think at some point it'll be enough to tell the AI "ok, now do a thorough security audit, highlight all the potential issues, come up with a best practices design document, and fix all the vulnerabilities and bugs. Repeat until the codebase is secure and meets all the requisite protocol standards and industry best practices."
We're not there yet, but at some point, AI is gonna be able to blitz through things like that the way they blitz through making haikus or rewriting news articles. At some point AI will just be reliably competent.
Definitely not there yet. The dark factory pattern is terrifying, lol.
simonw
10 days ago
That's definitely a pattern people are already starting to have good results from - using multiple "agents" (aka multiple system prompts) where one of them is a security reviewer that audits for problems and files issues for other coding agents to then fix.
I don't think this worked at all well six months ago. GPT-5.2 and Opus 4.5 might just be good enough for this pattern to start being effective.
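A minimal sketch of that reviewer-files-issues, coder-fixes-them loop, assuming each "agent" is just a function: in a real setup each would wrap a model call with its own system prompt. Every name here (`Issue`, `review_fix_loop`, `toy_reviewer`, `toy_fixer`) is hypothetical and the toy agents are plain string checks, not any real product's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Issue:
    """A problem filed by the reviewer agent against one file."""
    path: str
    description: str


# Hypothetical agent signatures. In practice each would be an LLM call
# with a different system prompt; here they are plain functions.
Reviewer = Callable[[dict[str, str]], list[Issue]]
Fixer = Callable[[dict[str, str], Issue], dict[str, str]]


def review_fix_loop(codebase, reviewer, fixer, max_rounds=5):
    """Alternate review and fix passes.

    Stops when the reviewer files no issues or the round budget runs out.
    Returns (codebase, remaining_issues).
    """
    for _ in range(max_rounds):
        issues = reviewer(codebase)
        if not issues:
            return codebase, []
        for issue in issues:
            codebase = fixer(codebase, issue)
    # Budget exhausted: report whatever the reviewer still complains about.
    return codebase, reviewer(codebase)


def toy_reviewer(codebase):
    """Stand-in security reviewer: flags any file that passes shell=True."""
    return [
        Issue(path, "subprocess call uses shell=True")
        for path, source in sorted(codebase.items())
        if "shell=True" in source
    ]


def toy_fixer(codebase, issue):
    """Stand-in coding agent: returns a patched copy, leaving the input alone."""
    patched = dict(codebase)
    patched[issue.path] = patched[issue.path].replace("shell=True", "shell=False")
    return patched


if __name__ == "__main__":
    repo = {"run.py": "subprocess.run(cmd, shell=True)", "ok.py": "print('hi')"}
    fixed, remaining = review_fix_loop(repo, toy_reviewer, toy_fixer)
    print(fixed["run.py"], "| open issues:", len(remaining))
```

The `max_rounds` cap is the part that matters in this sketch: without it, a reviewer that keeps finding new things to file would never let the loop end.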
jmalicki
9 days ago
This is basically what CodeRabbit had built - they just put a ton more time into building the specialized review agents.
FEELmyAGI
10 days ago
My current dark factory stack is using a Cyber Elon [0] as CEO with a dev team consisting of Gilfoyle, 2x Mr Robots, and Pickle Rick, with Alan Turing as dev manager. I easily 5x'd my output in raw performance metrics with this, and that's on top of the 10x over baseline dev performance I had already easily achieved using vanilla agents and other mainstream AI techniques. Whenever people say AI is just glorified auto complete I know they haven't been using the latest model versions.
[0] Basically an immortal version of Elon Musk with his mind fused cybernetically with Grok AI
antonvs
8 days ago
> My current dark factory stack is using a Cyber Elon as CEO
How picture perfect are its Nazi salutes?
xyzsparetimexyz
9 days ago
That's so lame dude
jwpapi
10 days ago
Honestly I’m not sure we’re not there yet. Run this prompt as a ralph loop for 2 days on your codebase and see where you’re at...
noosphr
10 days ago
Canadian girlfriend coding strikes again.
I would love for someone to point to a codebase done by an ai with the code, history and cost that's good. It's always a ball of mud that doesn't work and even the ai that coded it up can't maintain it.
simonw
10 days ago
What were the last three that you looked at that disappointed you, and what did you find lacking with them?
noosphr
10 days ago
Instead of asking for failures why not show me a success.
You're one of the most bullish people on AI, what's the open source codebase generated entirely by AI that has impressed you the most?
simonw
10 days ago
Because I've played this game too many times before - I know that some people will find a hole in any example you show them.
So before doing that work, I want to get a feel for if you're asking this question in good faith and have done any active looking yourself.
(My favorite two recent open source examples are https://simonwillison.net/2026/Jan/27/one-human-one-agent-on... and https://github.com/antirez/flux2.c)
noosphr
10 days ago
>>What's the open source codebase generated entirely by AI that has impressed you the most?
>One Human + One Agent = One Browser From Scratch
I at least expect you to read my post before replying.
simonw
10 days ago
What do you mean? Are you suggesting that the "one human" means it wasn't entirely written by AI?
That's not the case, the "one human" there is the one human prompting it: https://emsh.cat/one-human-one-agent-one-browser/
If your goalpost here is "no human involved at all" then it's a good thing I asked you what your goalposts were before spending any time on this!
UPDATE: OK I think I see what's happened here! You're asking to see an open source repo that was built using the "dark factory" pattern, where no code was even reviewed by a human.
I don't think I've seen one of those yet - I mean maybe that Cursor FastRender thing comes close?
It's a very radical technique. I don't think many people are trying this yet - I haven't been brave enough to try it myself yet.
I guess I kind of did that with my Python WASM library? That was an experiment in how far I could get with prompting and not reviewing, but it's not something I'd hold up as a shining example of how projects should be built: https://github.com/simonw/pwasm
noosphr
9 days ago
Your original post in the thread is about an automated dark factory with thousands of AI agents. It's amazing but we can't see it because they are in stealth mode.
Then the first example of a project done by AI without human intervention is someone who _explicitly_ states that they drove the way the agent behaved.
From the blog:
>The human who drives the agent might matter more than how the agents work and are set up, the judge is still out on this one
>If one person with one agent can produce equal or better results than "hundreds of agents for weeks", then the answer to the question: "Can we scale autonomous coding by throwing more agents at a problem?", probably has a more pessimistic answer than some expected.
I'm really not understanding what this proves other than the fact that AI + human is great and AI + AI is shit. Something that both me and the person who did the browser agreed on: https://news.ycombinator.com/item?id=46783282
simonw
9 days ago
Yeah, the "dark factory" thing is basically unproven right now. I reported on what I'd seen because it was genuinely fascinating, and a potential glimpse into how this stuff might work. I'm not ready to say that it's a good idea or that it's demonstrated to work outside of a demo I saw for an hour a couple of months ago that looked credible to me at the time.
ElFitz
9 days ago
> Yeah, the "dark factory" thing is basically unproven right now.
Isn’t that as close as "it" gets for now?
> but then I started trusting the model more and more. These days I don’t read much code anymore. I watch the stream and sometimes look at key parts, but I gotta be honest - most code I don’t read. I do know where which components are and how things are structured and how the overall system is designed, and that’s usually all that’s needed.
ben_w
9 days ago
> Isn’t that as close as "it" gets for now?
Always has been.
Even when I was a kid, people were saying all software is either a prototype or obsolete.
The difference is that the cycle got compressed: half of what we know used to become obsolete every 18 months (we just don't know which half), and now it's every 18 weeks.
qingcharles
9 days ago
My biggest project (in LOCs) is 100% AI written and I've given up reviewing the code on it. Huge web-based content management system with a native desktop app companion. It's worked flawlessly 24/7 for the last couple of months. I add a new feature every week or so, but I just do the code-as-English dance now and test what comes out. It's almost exclusively all Gemini 3 Pro and Opus 4.5. I've gone fully dark on that project.
I have other projects where I review almost every line, but everything is edging towards the dark side.
I've been coding for 40 years in every language you can think of. Glad that's over, honestly. It always got in the way of turning an idea into a product.
antonvs
8 days ago
> Nobody reviews AI-produced code, ever. They don't even look at it.
How is this supposed to differ from the original Karpathy definition of vibe coding? Is it just "vibe coding plus rigorous verification"?
(Or is it mainly intended to sound more desirable than vibe coding?)
simonw
8 days ago
"vibe coding plus rigorous verification" is a really good way of describing it.