dofm
an hour ago
No it's not. This has always been a needlessly iconoclastic rather than sensible suggestion.
At the very least it is not once you're working at the wrong kind of scale.
Once you have an awkward number of customers (more than five and less than a hundred), maintaining duplicated code that should have been abstracted and modularised will only seem cheap if you don't mind that you burn through even junior employees at a pace.
And in the LLM era the wrong kind of scale appears in different ways: code is generated and duplicated without proper abstraction and then maintained by an LLM that cannot be trusted to make the same modification each time it encounters a pattern, or to have enough of an overview to slowly rescue duplicated code through good abstractions.
I would go as far as to say that any abstraction you can maintain (that is in active maintenance, I mean) is better than code duplication once you are past a de minimis threshold.
coldtea
41 minutes ago
Hardly iconoclastic, it's a very sensible suggestion.
It would be iconoclastic if the common-sense default were to start with abstraction. It's not: the common-sense default is to write possibly duplicate behavior until you actually discover several cases to abstract away, until you develop a sensible idea of which functionality unites them and which doesn't carry over to all of them.
>Once you have an awkward number of customers (more than five and less than a hundred), maintaining duplicated code that should have been abstracted and modularised will only seem cheap if you don't mind that you burn through even junior employees at a pace
Maintaining the wrong abstraction, or, god help, abstractions, would be even worse.
shinycode
23 minutes ago
At work there was a huge amount of duplication at the start of the company and no solid abstraction, and so no tests either. We introduced tests in the current architecture, but rewriting code carries a huge cost to make sure there is no regression. When we talk about a SaaS it’s non-trivial: many customers rely on this tool daily as part of their workflow, and regressions caused by a rewrite could be really painful for them. So we must give a greater budget to take the time to make sure nothing major breaks. So there is a debt that compounds over time as code is added. Duplication is bad, and weird/purist abstraction could make the architecture so rigid that rewriting things generates bugs that are hard to understand and catch. It’s hard to find a good balance and it depends on the kind of business and scale of project. Hard to make that generic advice.
chairmansteve
5 minutes ago
"It’s hard to find a good balance and it depends on the kind of business and scale of project".
Exactly. The abstraction purists are not working in the messy, deadline-driven real world.
dofm
34 minutes ago
> Maintaining the wrong abstraction, or, god help, abstractions, would be even worse.
Hard disagree. When you've had to chase through a change in untold and actually unknown numbers of duplications of code in different permutations and fix them because they are all on fire simultaneously, you'd disagree too. A bad abstraction would at least have had one fire in one place.
davidee
16 minutes ago
Good faith question: would it?
Wouldn't most large codebases with poor abstractions just have engineers engineer around them with their own solutions? In a large enough codebase you'd have both the bad abstractions and all the not-quite-duplicate implementations ignoring the bad abstraction?
I'm using bad here loosely, it could be buggy, incorrect, incomplete, insufficient and more; while being owned by someone or some team that's a challenge to work with for various reasons (overloaded, under-resourced, overbearing, etc., etc.).
dofm
7 minutes ago
> Wouldn't most large codebases with poor abstractions just have engineers engineer around them with their own solutions?
Obviously, yes. But it is my experience that this happens more slowly and that API invocations that break when the abstraction is changed are much easier to identify than broader duplicated patterns of code that span many lines and subtly diverge.
And even then those divergences are better, because each wrapper around the abstraction is documenting the problem with it. But the abstraction can generally be replaced by one with the same API surface.
(Even if you take into account the fact that any API behaviour ultimately gets relied upon even if undocumented. Which is true.)
To be fair my experience is that of a freelancer and contractor who arrives trying to fix things that have been through many such hands. And I think if these developers had it drummed into their head that any attempt at abstraction would be better than copy and paste, these situations would be more knowable.
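To illustrate the point about API invocations being easier to identify than diverged copies, here is a minimal sketch using Python's standard `ast` module. The function name `format_price` and the sample source are hypothetical, made up for the illustration; nothing here comes from a real codebase in the thread.

```python
import ast

def find_call_sites(source: str, func_name: str) -> list[int]:
    """Return the line numbers where func_name is called in source."""
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            target = node.func
            # Plain calls (format_price(...)) are ast.Name nodes;
            # attribute calls (billing.format_price(...)) are ast.Attribute nodes.
            if isinstance(target, ast.Name):
                name = target.id
            else:
                name = getattr(target, "attr", None)
            if name == func_name:
                lines.append(node.lineno)
    return sorted(lines)

# Hypothetical sample: two callers of the abstraction and one hand-copied variant.
SAMPLE = '''
def invoice(total):
    return format_price(total)

def receipt(total):
    return "Paid: " + billing.format_price(total)

def legacy(total):
    return "$" + str(round(total, 2))  # duplicated logic, invisible to the search
'''

print(find_call_sites(SAMPLE, "format_price"))  # [3, 6]
```

The two callers are found mechanically by name, while the `legacy` copy can only be found by guessing at what the duplicated lines might look like.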
coldtea
3 minutes ago
>A bad abstraction would at least have had one fire in one place.
On the contrary: that's precisely what a bad abstraction would not offer.
Instead it would spread its assumptions to different parts of the system, as every caller, sub-service, etc. would have to change shape to fit in that abstraction's box, however unnatural it is (and we know it would be unnatural, because we already said it's a bad abstraction).
Abstraction is not the same as encapsulation.
rpdillon
14 minutes ago
In your mind, what's the cost of the wrong abstraction?
bluefirebrand
38 minutes ago
Yeah, "Write Everything Twice" is a pretty common and sensible direction for any codebase
cwmoore
20 minutes ago
Yeah, ~"Write Everything Twice"~ “Copy and Paste Working Code” is a pretty common and sensible direction for any codebase
fny
an hour ago
Code duplication is cheaper than the wrong abstraction. If you have a good abstraction, you should run with it.
If you haven't figured out a good abstraction at 5-100 customers, God help you.
feoren
8 minutes ago
A good abstraction? As in one? I'd go so far as to say the process of discovering and refining abstractions is the most important part of software engineering. A large project has dozens of abstractions, and some of them are "wrong" at any time, as you discover over time. None are ever perfect. If you wait to stop duplicating code until you have the "right" abstraction, you are just putting off the hard part of developing software and taking on tech debt.
Half of your abstractions are wrong. The hard part is knowing which half.
enos_feedler
7 minutes ago
What if there is no good abstraction for the entire stack of software on each of our computers? What if we built a common one because we had to? What if now we get to all make our own with natural language?
dofm
30 minutes ago
I disagree.
But also it's very possible to not realise you needed an abstraction until it catches fire in multiple places.
And quite often it's not you that got the codebase to a hundred customers, is it? Sometimes it is a sequence of fresh-faced young developers who didn't have the authority to say "this duplication is bullshit" and were instead compelled to repeat it.
I think a lot of these discussions happen in nice little blog-post vacuums of progressive thinking, where people can go "mmm, object oriented coding obscures intent and clarity, mmm", blog posts with "an X is a Y", "the unreasonable effectiveness of foobar" etc.
In the real world, every duplication that works sticks for good; there is rarely budget to electively replace code that isn't broken. Until one day it doesn't work. And then… how many times is it actually duplicated? How many of the duplicates diverged? How many of these do we no longer need?
chairmansteve
9 minutes ago
> I disagree.
So... the wrong abstraction, no matter how bad, is better than code duplication?
ChrisMarshallNY
14 minutes ago
In my experience, the answer is always "It Depends." That's about the only thing that I can hang "always" on.
It really depends on the exact type of code we're working with, and what our objectives are.
In my case, I often use object inheritance. It's a damn cheap way to DRY. However, when people hear "inheritance," they often think "polymorphism." There's a really big difference between the two, but popular culture has jammed them into one ball, and it's not worth the agita, to try to explain the difference.
But if you are doing optimization, long stacks can be your enemy, and inheritance tends to have long, windy stacks.
In these cases, the copy/pasta method may well be the best approach.
Like I said, "It Depends."
mytydev
24 minutes ago
It sounds to me like you are describing a good abstraction. This article does not claim that code duplication is better than any abstraction. It claims that code duplication is better than the wrong abstraction. I'm sure this author would agree that a good abstraction is better than code duplication.
dofm
20 minutes ago
I'm afraid this comment reads in a rather gnomic way.
Of course it's a truism if you just say any abstraction that works is a good abstraction.
That is not what I am saying at all. Bullshit abstractions at least let you control the problem. Duplication doesn't.
vlunkr
13 minutes ago
But it’s never going to be 1:1 duplication is it? Sometimes it’s better to copy code as a template for something new, rather than try to immediately force a new abstraction.
I agree with you that it’s a truism, but it’s useful advice for people who have a habit of trying too hard to DRY their code. IIRC the author comes from the Ruby world, where DRY was a big thing, and this talk was part of the pendulum swinging back away from this DRY obsession that sometimes just resulted in convoluted code.
agumonkey
33 minutes ago
You seem to have experience. I don't mind factoring / unifying logic when it's done sensibly, with enough history in the trenches. It pains me more whenever a young dev comes in and barks "we must merge these two things!" repeatedly, without planning for more than two cases, and starts adding more and more boolean variables. Crystal makers. Then the obvious issue comes: the two variants weren't that close, and now there's one god class trying to handle all forces in one big state.
I agree that LLMs are naturally anti-abstraction machines. I'm often trying to find a way to reverse that.
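A rough sketch of the flag-driven merge described above, for readers who have not run into it. The export functions and flag names are hypothetical, chosen only to show how each boolean doubles the number of paths through the merged function.

```python
from itertools import product

# Two hypothetical variants that looked similar enough to merge.
def export_csv(rows):
    return "\n".join(",".join(f'"{v}"' for v in row) for row in rows)

def export_tsv(rows):
    return "\n".join("\t".join(str(v) for v in row) for row in rows)

# The merged version: every difference between the variants becomes a flag.
def export_merged(rows, as_csv=False, quote_all=False, with_header=False):
    sep = "," if as_csv else "\t"
    fmt = (lambda v: f'"{v}"') if quote_all else str
    lines = [sep.join(fmt(v) for v in row) for row in rows]
    if with_header and rows:
        lines.insert(0, sep.join(fmt(f"col{i}") for i in range(len(rows[0]))))
    return "\n".join(lines)

# Every combination of flags is a distinct behaviour someone has to reason about.
FLAGS = ("as_csv", "quote_all", "with_header")
combos = [dict(zip(FLAGS, values)) for values in product([False, True], repeat=len(FLAGS))]
print(len(combos))  # 8
```

Three flags already give eight combinations, most of which no caller asked for. A fourth variant that needs one more flag takes that to sixteen.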
dofm
23 minutes ago
> I agree that LLMs are naturally anti-abstraction machines. I'm often trying to find a way to reverse that.
I am a bit of an LLM cynic but I am trying to learn it all, and I have to say I have spent most of my time trying to work out: how do you explain how a brown-field codebase actually works, in such a way that the LLM won't pervert it through misunderstanding?
It does encourage you towards the "conventional" coding standard for any new project, because you want to use a pattern that it will have seen in its training set.
But for example there are differences of opinion in how WordPress plugins (which have a very complex control flow) should be structured. LLMs are incredible at knowing how WP works, actually, but what is difficult is explaining how your methodology for a large plugin is going to work.
It is a battle — but a useful one because it can be used for, er, studying the comparative belief systems of the LLMs.
wonnage
6 minutes ago
They don’t have a useful belief system. One of the rookie mistakes of using LLMs is asking them what you “should” do.
Thaxll
a minute ago
So you centralize 3 liners?
nfw2
17 minutes ago
Over-engineering and "abstraction hell" are very much not iconoclastic concepts
mawadev
an hour ago
I think you applied this idea to the era of LLMs, but consider an abstraction that takes in multiple god structs for branches it may or may not call in the case you are looking at, and has a lot of if conditions that explode in combinatorial complexity across a deep call chain. Now the bottleneck is that you need to call this function 144 times a second. That is where you start to have clusters of hot code paths where the latency stacks depending on the angle the god structs come in. Not sure what LLMs do here, I don't vibe code.
dofm
39 minutes ago
I am applying it to LLMs on the basis of twenty years of seeing smaller programming shops tie themselves in knots by using duplication to avoid developing an abstraction that would help them because they were unsure of it.
Everyone always thinks duplication is fine when you can bill the modifications by the hour. But they never think to understand that the reason they've had so many employees is that they've turned their change process into firefighting all the different versions of the same code and all these young developers burn out from the sheer anxiety of not knowing where all the little fires are.
I once had to rescue a site that had become a victim of its own popularity, that was written by subcontractors who clearly believed that duplication is better than the wrong abstraction.
Until one day, along came a change — MySQL 4 to MySQL 5 — and a significant duplicated query no longer worked due to its new, proper strictness.
The problem was compounded; not only was the broken pattern in hundreds of places where it had sat, stable and predictable, but the pattern was broken because it, itself, was avoidance of another abstraction that would solve it.
They quit: they said they couldn't and wouldn't fix it. It had always worked how they had done it, and it would have to stay on MySQL 4 (which the hosting provider refused to accommodate).
I don't think it helped that they were severely misguided in their understanding of SQL, but the code had become beholden to duplication and then crippled by a new problem in the duplicated pattern.
I had to first find all the contexts in which that pattern appeared (which required me to spend half a day on a bespoke script) and then work out a new pattern and as few variations of it as possible to fix the duplicated code in each place, because there was no proper budget to rewrite the whole thing. And then I sat at my desk, for days, working through each one, figuring out how to change it to fit the slightly different expression of the pattern.
Even a total bullshit abstraction would have saved that client both time and money. And this is only one of dozens of times I've seen small firms simply duplicate and change code that would later become unmaintainable because of a straw breaking a camel's back.
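For a sense of what a script like that might look like, here is a minimal sketch. It is not the script from the story: the actual broken query pattern is never stated, so a `SELECT ... GROUP BY` regex stands in for it as a pure assumption, and the file names and contents are invented. It finds every occurrence, then groups occurrences that differ only in whitespace, case or numeric literals, so you can see how many real variants there are.

```python
import re
from collections import defaultdict

# ASSUMPTION: a stand-in for the real duplicated pattern. Deliberately crude;
# a lazy match like this can span more than one statement in messy code.
QUERY = re.compile(r"SELECT\b.*?\bGROUP BY\b[^;\"']*", re.IGNORECASE | re.DOTALL)

def normalize(sql: str) -> str:
    """Collapse whitespace and case, and mask numeric literals, so near-identical copies group together."""
    sql = re.sub(r"\s+", " ", sql.strip()).lower()
    return re.sub(r"\b\d+\b", "?", sql)

def group_variants(files: dict[str, str]) -> dict[str, list[str]]:
    """Map each normalized variant to the path:line locations where it occurs."""
    groups = defaultdict(list)
    for path, text in files.items():
        for match in QUERY.finditer(text):
            line = text.count("\n", 0, match.start()) + 1
            groups[normalize(match.group())].append(f"{path}:{line}")
    return dict(groups)

# Hypothetical file contents; a real run would read these from disk.
FILES = {
    "orders.php": '$q = "SELECT id, name FROM orders GROUP BY name";',
    "report.php": '\n$q = "select id,  name from orders\n  group by name";',
    "stats.php": '$q = "SELECT id, total FROM orders WHERE year = 2004 GROUP BY total";',
}

for variant, places in group_variants(FILES).items():
    print(len(places), variant, places)
```

Here three occurrences collapse into two variants. In a codebase with hundreds of copies, the size of that second number is roughly what decides how many days get spent at the desk.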
Capricorn2481
29 minutes ago
Again, this is the opposite of what the author argues for, which is waiting for a couple instances before committing to an abstraction. Not duplicating a SQL query across hundreds of places.
I would be curious if the previous coders you're talking about actually cited duplication as a good thing. You seem to be implying they did. But almost every instance I've seen of massive code duplication was just from bad programmers shooting from the hip, not from some ideological stance.
dofm
21 minutes ago
> Again, this is the opposite of what the author argues for, which is waiting for a couple instances before committing to an abstraction. Not duplicating a SQL query across hundreds of places.
Right. But this is a hypothetical, in-a-vacuum situation.
In the real world, your two, three duplicates are in production.
"We really should now de-duplicate this"
"There is not the time or budget, just copy it again; we'll replace all this one day".
Capricorn2481
42 minutes ago
> I would go as far as to say that any abstraction you can maintain (that is in active maintenance, I mean) is better than code duplication once you are past a de minimis threshold.
Pretty much everyone arguing for duplication has argued what you are saying, which is wait to see a few instances of it before committing to an abstraction. No one is saying duplicate everything 100 times. So I don't think this discussion was ever iconoclastic.
dofm
27 minutes ago
The point is it sounds all smart and sophisticated and principled in the abstract environment of a code discussion in a blog post.
In the real world, duplication happens in an emergent way: there isn't the time on each occasion to judge whether it's really time to just quietly abstract that code, you may not get the permission, budget or window to do it, and if you don't stop the rot really early you are locked into the pattern.
tracerbulletx
23 minutes ago
Huh? If anything having lots of customers makes the argument for duplication stronger. The issue is almost always once you get huge and 5 product teams are trying to achieve 5 different goals by using the same overwrought abstraction instead of just copying and decoupling. The abstractions that are actually stable end up becoming libraries or platform team owned systems that no one ever really touches.