muin_kr
11 hours ago
The "fewer wasted tokens" framing is exactly right. The biggest practical bottleneck with coding agents isn't model quality — it's context window pollution from irrelevant code.
Interesting that you went with a hybrid semantic + lexical approach. Pure semantic search on code tends to miss exact identifier matches, and pure lexical misses conceptual similarity. The combination is the right call.
How does indexing scale? For a 10k-file monorepo, what's the initial index time and index size? And does it handle incremental updates (only re-index changed files), or is it full re-index each time?
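For readers unfamiliar with hybrid retrieval, one common way to combine the two signals is reciprocal rank fusion. This is just a generic sketch of that technique, not this tool's actual scoring; the function name and the `k` constant are illustrative:

```python
# Hypothetical sketch: fusing a semantic ranking and a lexical ranking
# with reciprocal rank fusion (RRF). Each list holds doc IDs, best first.

def rrf_fuse(semantic_ranking, lexical_ranking, k=60):
    """Combine two ranked lists of doc IDs into one fused ranking."""
    scores = {}
    for ranking in (semantic_ranking, lexical_ranking):
        for rank, doc_id in enumerate(ranking):
            # Higher-ranked docs contribute more; k dampens the top ranks.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf_fuse(["parse.rs", "lexer.rs", "ast.rs"],
                 ["lexer.rs", "ast.rs", "parse.rs"])
```

A chunk that ranks decently in both lists can outscore one that tops only a single list, which is how exact-identifier hits and conceptual matches reinforce each other.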
Yerzhigit
6 hours ago
1. Indexing is incremental: only changed files get re-embedded, either on subsequent runs or in the background via the watch-mode command. So no, it's not a full re-index each time unless forced.

2. A 63k-line Rust codebase took about 4 minutes for a full index on my laptop's RTX 3060 (6 GB VRAM). For a 10k-file monorepo, I honestly haven't tested yet. The vector store is flat brute-force (no HNSW), which works well under ~50k chunks. At 10k files you'd probably hit 200k+ chunks, so that part would need work. If you have a repo that size, I'd love to know how it goes.