I'm building a 30k‑line V12 codebase solo with a "team" of 4 AIs

7 points, posted 8 hours ago
by garylauchina

Item id: 46462910

5 Comments

garylauchina

2 hours ago

Small correction on my own post: I just went back and measured things properly and… I was way off.

The codebase isn’t ~30k lines anymore – it’s around 223k lines of code, plus about 243k lines of Markdown (research notes, design docs, experiment logs, etc.). So the “large system + context window” pain I’m describing is happening at a significantly bigger scale than I initially claimed.
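For anyone wanting to reproduce this kind of measurement, here's a minimal sketch of a counter for non-blank lines grouped by extension (my own illustration, not the author's tooling; a dedicated tool like cloc gives more accurate, comment-aware numbers):

```python
from collections import Counter
from pathlib import Path

def count_loc(root: str, exts=(".py", ".md")) -> Counter:
    """Rough LOC proxy: count non-blank lines per file extension under root."""
    totals = Counter()
    for path in Path(root).rglob("*"):
        # In practice you'd also exclude .git/, build output, vendored deps, etc.
        if path.is_file() and path.suffix in exts:
            text = path.read_text(errors="ignore")
            totals[path.suffix] += sum(1 for ln in text.splitlines() if ln.strip())
    return totals
```

Splitting code extensions from `.md` is what lets you report the "223k code vs. 243k Markdown" breakdown separately.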

Which also explains why I’ve become slightly obsessed with measurement, contracts, and forcing everything through text before letting the “programmer” AI touch the code.

nunobrito

7 hours ago

You really seem on top of your game. Just to mention that 30k-line codebases aren't typically complex, in my experience. When they are, it usually means you need to simplify the overall structure and unify the functions as much as possible.

garylauchina

6 hours ago

Fair point – 30k LOC by itself isn’t “complex” in any absolute sense. In my case the difficulty isn’t the raw size, it’s:

long‑running experiments and measurement contracts that span many modules,

an evolving research agenda (I’m at V12 now, with quite a few pivots along the way),

and the fact that I’m doing all the design + implementation solo.

I agree that when something feels complex at this scale, it’s often a sign the structure needs to be simplified and unified. A big part of what I use the “architect” agent for is exactly that: forcing myself to rewrite and consolidate interfaces instead of just piling on more code.

zkmon

8 hours ago

How does the hand-off work among the team roles? Is it waterfall or iterative? How does the programmer complain about something that's wrong or inconsistent in the architecture or requirements?

garylauchina

6 hours ago

It’s iterative, not a strict waterfall, but the constraint is “text first, code second”.

A typical loop looks like this:

I talk with Perplexity/ChatGPT to clarify the requirement and trade‑offs.

The “architect” Cursor window writes a short design note: intent, invariants, interfaces, and a checklist of concrete tasks.

The “programmer” Cursor window implements those tasks, one by one, and I run tests / small experiments.

If something feels off, I paste the diff and the behavior back to the architect and we adjust the design note, then iterate.
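The design note the architect writes in step 2 might look something like this (a hypothetical template matching the listed ingredients — intent, invariants, interfaces, task checklist — not the author's actual format):

```
# Design note: <change being made>

## Intent
One paragraph: what this change is for, and what it must not break.

## Invariants
- e.g. "every experiment run records its config hash before producing output"

## Interfaces
- the signatures / module boundaries the programmer may rely on

## Tasks
- [ ] small, independently testable steps, implemented one by one
```

Keeping each task small is what makes step 3 ("implements those tasks, one by one") checkable against tests before moving on.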

There is a feedback channel from “programmer” to “architect”, but it goes through me. When the programmer model runs into something that doesn’t fit (“this API doesn’t exist”, “these two modules define similar concepts in different ways”, etc.), I capture that as:

comments in the code (“this conflicts with X”), and

updates to the architecture doc (“rename Y to Z, merge these two concepts”, “deprecate this path”, etc.).

So the architect is not infallible. It gets corrected by reality: tests failing, code being awkward to write, or new edge cases showing up. The main thing the process enforces is that those corrections are written down in prose first, so future changes don’t silently drift away from whatever the system used to be.

In that sense the “programmer” complains the same way human ones do: by making the spec look obviously wrong when you try to implement it.