hackernews client

My AI skeptic friends are all nuts

2312 pointsposted 6 days ago

2676 Comments

matthewsinclair

5 days ago

I think this article is pretty spot on — it articulates something I’ve come to appreciate about LLM-assisted coding over the past few months.

I started out very sceptical. When Claude Code landed, I got completely seduced — borderline addicted, slot machine-style — by what initially felt like a superpower. Then I actually read the code. It was shockingly bad. I swung back hard to my earlier scepticism, probably even more entrenched than before.

Then something shifted. I started experimenting. I stopped giving it orders and began using it more like a virtual rubber duck. That made a huge difference.

It’s still absolute rubbish if you just let it run wild, which is why I think “vibe coding” is basically just “vibe debt” — because it just doesn’t do what most (possibly uninformed) people think it does.

But if you treat it as a collaborator — more like an idiot savant with a massive brain but no instinct or nous — or better yet, as a mech suit [0] that needs firm control — then something interesting happens.

I’m now at a point where working with Claude Code is not just productive, it actually produces pretty good code, with the right guidance. I’ve got tests, lots of them. I’ve also developed a way of getting Claude to document intent as we go, which helps me, any future human reader, and, crucially, the model itself when revisiting old code.

What fascinates me is how negative these comments are — how many people seem closed off to the possibility that this could be a net positive for software engineers rather than some kind of doomsday.

Did Photoshop kill graphic artists? Did film kill theatre? Not really. Things changed, sure. Was it “better”? There’s no counterfactual, so who knows? But change was inevitable.

What’s clear is this tech is here now, and complaining about it feels a bit like mourning the loss of punch cards when terminals showed up.

[0]: https://matthewsinclair.com/blog/0178-why-llm-powered-progra...

wpietri

5 days ago

One of the things I think is going on here is a sort of stone soup effect. [1]

Core to Ptacek's point is that everything has changed in the last 6 months. As you and I presume he agree, the use of off-the-shelf LLMs in code was kinda garbage. And I expect the skepticism he's knocking here ("stochastic parrots") was in fact accurate then.

But it did get a lot of people (and money) to rush in and start trying to make something useful. Like the stone soup story, a lot of other technology has been added to the pot, and now we're moving in the direction of something solid, a proper meal. But given the excitement and investment, it'll be at least a few years before things stabilize. Only at that point can we be sure about how much the stone really added to the soup.

Another counterfactual that we'll never know is what kinds of tooling we would have gotten if people had dumped a few billion dollars into code tool improvement without LLMs, but with, say, a lot of more conventional ML tooling. Would the tools we get be much better? Much worse? About the same but different in strengths and weaknesses? Impossible to say.

So I'm still skeptical of the hype. After all, the hype is basically the same as 6 months ago, even though now the boosters can admit the products of 6 months ago sucked. But I can believe we're in the middle of a revolution of developer tooling. Even so, I'm content to wait. We don't know the long term effects on a code base. We don't know what these tools will look like in 6 months. I'm happy to check in again then, where I fully expect to be again told: "If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing." At least until then, I'm renewing my membership in the Boring Technology Club: https://boringtechnology.club/

[1] https://en.wikipedia.org/wiki/Stone_Soup

keeda

5 days ago

> Core to Ptacek's point is that everything has changed in the last 6 months.

This was actually the only point in the essay with which I disagree, and it weakens the overall argument. Even 2 years ago, before agents or reasoning models, these LLMs were extremely powerful. The catch was, you needed to figure out what worked for you.

I wrote this comment elsewhere: https://news.ycombinator.com/item?id=44164846 -- Upshot: It took me months to figure out what worked for me, but AI enabled me to produce innovative (probably cutting edge) work in domains I had little prior background in. Yes, the hype should trigger your suspicions, but if respectable people with no stake in selling AI like @tptacek or @kentonv in the other AI thread are saying similar things, you should probably take a closer look.

wpietri

4 days ago

>if respectable people with no stake in selling AI like @tptacek or @kentonv in the other AI thread are saying similar things, you should probably take a closer look.

Maybe? Social proof doesn't mean much to me during a hype cycle. You could say the same thing about tulip bulbs or any other famous bubble. Lots of smart people with no stake get sucked in. People are extremely good at fooling themselves. There are a lot of extremely smart people following all of the world's major religions, for example, and they can't all be right. And whatever else is going on here, there are a lot of very talented people whose fortunes and futures depend on convincing everybody that something extraordinary is happening here.

I'm glad you have found something that works for you. But I talk with a lot of people who are totally convinced they've found something that makes a huge difference, from essential oils to functional programming. Maybe it does for them. But personally, what works for me is waiting out the hype cycle until we get to the plateau of productivity. Those months that you spent figuring out what worked are months I'd rather spend on using what I've already found to work.

tptacek

4 days ago

The problem with this argument is that if I'm right, the hype cycle will continue for a long time before it settles (because this is a particularly big problem to have made a dent in), and for that entire span of time skepticism will have been the wrong position.

wpietri

3 days ago

I think it depends a lot on what you think "wrong position" means. I think skepticism only really goes wrong when it refuses to see the truth in what it's questioning long past the point where it's reasonable. I don't think we're there yet. For example, questions like "What is the long term effect on a code base" take us seeing the long term. Or there are legitimate questions about the ROI of learning and re-learn rapidly changing tools. What's worth it to you may not be in other situations.

I also think hype cycles and actual progress can have a variety of relationships. After Bubble 1.0 burst, there were years of exciting progress without a lot of hype. Maybe we'll get something similar here, as reasonable observers are already seeing the hype cycle falter. E.g.: https://www.economist.com/business/2025/05/21/welcome-to-the...

And of course, it all hinges on you being right. Which I get you are convinced of, but if you want to be thorough, you have to look at the other side of it.

tptacek

3 days ago

Well, two things. First, I spent a long time being wrong about this; I definitely looked at the other side. Second, the thing I'm convinced of is kind of objective? Like: these things build working code that clears quality thresholds.

But none of that really matters; I'm not so much engaging on the question of whether you are sold on LLM coding (come over next weekend though for the grilling thing we're doing and make your case then!). The only thing I'm engaging on here is the distinction between the hype cycle, which is bad and will get worse over the coming years, and the utility of the tools.

wpietri

3 days ago

Thanks! If I can make it I will. (The pinball museum project is sucking up a lot of my time as we get toward launch. You should come by!)

I think that is one interesting question that I'll want to answer before adoption on my projects, but it definitely isn't the only one.

And maybe the hype cycle will get worse and maybe it won't. Like The Economist, I'm starting to see a turn. The amount of money going into LLMs generally is unsustainable, and I think OpenAI's recent raise is a good example: round 11, $40 billion dollar goal, which they're taking in tranches. Already the largest funding round in history, and it's not the last one they'll need before they're in the black. I could easily see a trough of disillusionment coming in the next 18 months. I agree programming tools could well have a lot of innovation over the next few years, but if that happens against a backdrop of "AI" disillusionment, it'll be a lot easier to see what they're actually delivering.

mplanchard

4 days ago

So? The better these tools get, the easier they will be to get value out of. It seems not unwise to let them stabilize before investing the effort and getting the value out, especially if you’re working in one of the areas/languages where they’re still not as useful.

Learning how to use a tool once is easy, relearning how to use a tool every six months because of the rapid pace of change is a pain.

tptacek

4 days ago

This isn't responsive to what I wrote. Letting the tools stabilize is one thing, makes perfect sense. "Waiting until the hype cycle dies" is another.

mplanchard

4 days ago

I suspect the hype cycle and the stabilization curves are relatively in-sync. While the tools are constantly changing, there's always a fresh source of hype, and a fresh variant of "oh you're just not using the right/newest/best model/agent/etc." from those on the hype train.

tptacek

4 days ago

This is the thing. I do not agree with that, at all. We can just disagree, and that's fine, but let's be clear about what we're disagreeing about, because the whole goddam point of this piece is that nobody in this "debate" is saying the same thing. I think the hype is going to scale out practically indefinitely, because this stuff actually works spookily well. The hype will remain irrational longer than you can remain solvent.

mplanchard

4 days ago

Well, generally, that’s just not how hype works.

A thing being great doesn’t mean it’s going to generate outsized levels of hype forever. Nobody gets hyped about “The Internet” anymore, because novel use cases aren’t being discovered at a rapid clip, and it has well and throughly integrated into the general milieu of society. Same with GPS, vaccines, docker containers, Rust, etc., but I mentioned the Internet first since it’s probably on a similar level of societal shift as is AI in the maximalist version of AI hype.

Once a thing becomes widespread and standardized, it becomes just another part of the world we live in, regardless of how incredible it is. It’s only exciting to be a hype man when you’ve got the weight of broad non-adoption to rail against.

Which brings me to the point I was originally trying to make, with a more well-defined set of terms: who cares if someone waits until the tooling is more widely adopted, easy to use, and somewhat standardized prior to jumping on the bandwagon? Not everyone needs to undergo the pain of being an early adopter, and if the tools become as good as everyone says they will, they will succeed on their merits, and not due to strident hype pieces.

I think some of the frustration the AI camp is dealing with right now is because y’all are the new Rust Evangelism Strike Force, just instead of “you’re a bad software engineer if you use a memory unsafe languages,” it’s “you’re a bad software engineer if you don’t use AI.”

scott_s

4 days ago

The tools are at the point now that ignoring them is akin to ignoring Stack Overflow posts. Basically any time you'd google for the answer to something, you might as well ask an AI assistant. It has a good chance of giving you a good answer. And given how programming works, it's usually easy to verify the information. Just like, say, you would do with a Stack Overflow post.

tptacek

4 days ago

Who you calling y'all? I'm a developer who was skeptical about AI until about 6 months ago, and then used it, and am now here to say "this shit works". That's all. I write Go, not Rust.

People have all these feelings about AI hype, and they just have nothing at all to do with what I'm saying. How well the tools work have not much at all to do with the hype level. Usually when someone says that, they mean "the tools don't really work". Not this time.

antifa

4 days ago

> You could say the same thing about tulip bulbs or any other famous bubble. Lots of smart people with no stake get sucked in.

While I agree with the skepticism, what specifically is the stake here? Most code assists have usable plans in the $10-$20 range. The investors are apparently taking a much bigger risk than the consumer would be in a case like this.

Aside from the horror stories about people spending $100 in one day of API tokens for at best meh results, of course.

wpietri

3 days ago

The stake they and I were referring to is a financial interest in the success of AI. Related is the reputational impact, of course. A lot of people who may not make money do like being seen as smart and cutting edge.

But even if we look at your notion of stake, you're missing huge chunks of it. Code bases are extremely expensive assets, and programmers are extremely expensive resources. $10 a month is nothing compared to the costs of a major cleanup or rewrite.

kentonv

4 days ago

Dude. Claude Code has zero learning curve. You just open the terminal app in your code directory and you tell it what you want, in English. In the time you have spent writing these comments about how you don't care to try it now because it's probably just hype, you could have actually tried it and found out if it's just hype.

lolinder

4 days ago

I've tried Claude Code repeatedly and haven't figured out how to make it work for me on my work code base. It regularly gets lost, spins out of control, and spends a bunch of tokens without solving anything. I totally sympathize with people who find Claude Code to have a learning curve, and I'm writing this while waiting for Cursor to finish a task I gave it, so it's not like I'm unfamiliar with the tooling in general.

One big problem with Claude Code vs Cursor is that you have to pay for the cost of getting over the learning curve. With Cursor I could eat the subscription fee and then goof off for a long time trying to figure out how to prompt it well. With Claude Code a bad prompt can easily cost me $5 a pop, which (irrationally, but measurably) hurts more than the one-time monthly fee for Cursor.

kentonv

4 days ago

Claude Code actually has a flat-rate subscription option now, if you prefer that. Personally I've found the API cost to be pretty negligible, but maybe I'm out of touch. (I mean, it's one AI-generated commit, Michael. What could it cost, $5?)

Anyway, if you've tried it and it doesn't work for you, fair enough. I'm not going to tell you you're wrong. I'm just bothered by all the people who are out here posting about AI being bad while refusing to actually try it. (To be fair, I was one of them, six months ago...)

wpietri

3 days ago

I could not have, because my standards involve more than a five minute impression from a tool designed to wow people in the first five minutes. Dude.

kentonv

3 days ago

I think you're rationalizing your resistance to change. I've been there!

I have no reason to care whether you use AI or not. I'm giving you this advice just for your sake: Consider whether you are taking a big career risk by avoiding learning about the latest tools of your profession.

potatolicious

5 days ago

> "Even 2 years ago, before agents or reasoning models, these LLMs were extremely powerful. The catch was, you needed to figure out what worked for you."

Sure, but I would argue that the UX is the product, and that has radically improved in the past 6-12 months.

Yes, you could have produced similar results before, manually prompting the model each time, copy and pasting code, re-prompting the model as needed. I would strenuously argue that the structuring and automation of these tasks is what has made these models broadly usable and powerful.

In the same way that Apple didn't event mobile phones nor touchscreens nor OSes, but the specific combination of these things resulted in a product that was different in kind than what came before, and took over the world.

Likewise, the "putting the LLM into a structured box of validation and automated re-prompting" is huge! It changed the product radically, even if its constituent pieces existed already.

[edit] More generally I would argue that 95% of the useful applications of LLMs aren't about advancing the SOTA model capabilities and more about what kind of structured interaction environment we shove them into.

keeda

5 days ago

For sure! I mainly meant to say that people should not attribute the "6 more months until it's really good" point as just another symptom of unfounded hype. It may have taken effort to effectively use AI earlier, which somewhat justified the caution, but now it's significantly easier and caution is counter-productive.

But I think my other point still stands: people will need to figure out for themselves how to fully exploit this technology. What worked for me, for instance, was structuring my code to be essentially functional in nature. This allows for tightly focused contexts which drastically reduces error rates. This is probably orthogonal to the better UX of current AI tooling. Unfortunately, the vast majority of existing code is not functional, and people will have to figure out how to make AI work with that.

A lot of that likely plays into your point about the work required to make useful LLM-based applications. To expand a bit more:

* AI is technology that behaves like people. This makes it confusing to reason about and work with. Products will need to solve for this cognitive dissonance to be successful, which will entail a combination of UX and guardrails.

* Context still seems to be king. My (possibly outdated) experience has been the "right" context trumps larger context windows. With code, for instance, this probably entails standard techniques like static analysis to find relevant bits of code, which some tools have been attempting. For data, this might require eliminating overfetching.

* Data engineering will be critical. Not only does it need to be very clean for good results, giving models unfettered access to the data needs the right access controls which, despite regulations like GDPR, are largely non-existent.

* Security in general will need to be upleveled everywhere. Not only can models be tricked, they can trick you into getting compromised, and so there need to even more guardrails.

A lot of these are regular engineering work that is being done even today. Only it often isn't prioritized because there are always higher priorities... like increasing shareholder value ;-) But if folks want to leverage the capabilities of AI in their businesses, they'll have to solve all these problems for themselves. This is a ton of work. Good thing we have AI to help out!

gopher_space

5 days ago

I don't think it's possible to understand what people mean by force multiplier re AI until you use it to teach yourself a new domain and then build something with that knowledge.

Building a mental model of a new domain by creating a logical model that interfaces with a domain I'm familiar with lets me test my assumptions and understanding in real time. I can apply previous experience by analogy and verify usefulness/accuracy instantly.

> Upshot: It took me months to figure out what worked for me, but AI enabled me to produce innovative (probably cutting edge) work in domains I had little prior background in. Yes, the hype should trigger your suspicions[...]

Part of the hype problem is that describing my experience sounds like bullshit to anyone who hasn't gone through the same process. The rate that I pick up concepts well enough to do verifiable work with them is literally unbelievable.

mwarkentin

4 days ago

AI posts (including this one) are all over his employers blog lately, so there’s some stake (fly MCP, https://fly.io/blog/fuckin-robots/, etc).

xpe

5 days ago

Almost by definition, one should be skeptical about hype. So we’re all trying to sort out what is being sold to us.

Different people have different weird tendencies in different directions. Some people irrationally assume that things aren’t going to change much. Others see a trend and irrationally assume that it will continue on a trend line.

Synthesis is hard.

Understanding causality is even harder.

Savvy people know that we’re just operating with a bag of models and trying to choose the right combination for the right situation.

This misunderstanding is one reason why doomers, accelerations, and “normies” talk past each other or (worse) look down on each other. (I’m not trying to claim epistemic equivalence here; some perspectives are based on better information, some are better calibrated than others! I’m just not laying out my personal claims at this point. Instead, I’m focusing on how we talk to each other.)

Another big source of misunderstanding is about differing loci of control. People in positions of influence are naturally inclined to think about what they can do, who they know, and where they want to be. People farther removed feel relatively powerless and tend to hold onto their notions of stability, such as the status quo or their deepest values.

Historically, programmers have been quite willing to learn new technologies, but now we’re seeing widespread examples where people’s plasticity has limits. Many developers cannot (or are unwilling to) wrap their minds around the changing world. So instead of confronting the reality they find ways to deny it, consciously or subconsciously. Our perception itself is shaped by our beliefs, and some people won’t even perceive the threat because it is too strange or disconcerting. Such is human nature: we all do it. Sometimes we’re lucky enough to admit it.

wpietri

4 days ago

I think "the reality", at least as something involving a new paradigm, has yet to be established. I'll note that I heard plenty of similar talk about how developers just couldn't adapt six months or more ago. Promoters now can admit those tools were in fact pretty bad, because they now have something else to promote, but at the time those not rawdogging LLMs were dinosaurs under a big meteor.

I do of course agree that some people are just refusing to "wrap their minds around the changing world". But anybody with enough experience in tech can count a lot more instances of "the world is about to change" than "the world really changed". The most recent obvious example being cryptocurrencies, but there are plenty of others. [1] So I think there's plenty of room here for legitimate skepticism. And for just waiting until things settle down to see where we ended up.

[1] E.g. https://www.youtube.com/watch?v=b2F-DItXtZs

xpe

4 days ago

Fair points.

Generally speaking, I find it suspect when someone points to failed predictions of disruptive changes without acknowledging successful predictions. That is selection bias. Many predicted disruptive changes do occur.

Most importantly, if one wants to be intellectually honest, one has to engage against a set of plausible arguments and scenarios. Debunking one particular company’s hyperbolic vision for the future might be easy, but it probably doesn’t generalize.

It is telling to see how many predictions can seem obvious in retrospect from the right frame of reference. In a sense (or more than that under certain views of physics), the future already exists, the patterns already exist. We just have to find the patterns — find the lens or model that will help the messy world make sense to us.

I do my best to put the hype to the side. I try to pay attention to the fundamentals such as scaling laws, performance over time, etc while noting how people keep moving the goalposts.

Also wrt the cognitive bias aspect: Cryptocurrencies didn’t threaten to apply significant (if any) downward pressure on the software development labor market.

Also, even cryptocurrency proponents knew deep down that it was a chicken and the egg problem: boosters might have said adoption was happening and maybe even inevitable, but the assumption was right out there in the open. It also had the warning signs of obvious financial fraud, money laundering, currency speculation, and ponzi scheming.

Adoption of artificial intelligence is different in many notable ways. Most saliently, it is not a chicken and egg problem: it does not require collective action. Anyone who does it well has a competitive advantage. It is a race.

(Like Max Tegmark and others, I view racing towards superintelligence as a suicide race, not an arms race. This is a predictive claim that can be debated by assessing scenarios, understanding human nature, and assigning probabilities.)

wpietri

3 days ago

> Generally speaking, I find it suspect when someone points to failed predictions of disruptive changes without acknowledging successful predictions.

I specifically said: "But anybody with enough experience in tech can count a lot more instances of 'the world is about to change' than 'the world really changed'. I pretty clearly understand that sometimes the world does change.

Funnily, I find it suspect when people accuse me of failing to do things I did in the very post they're responding to. So I think this is a fine time for us both to find better ways to spend our time.

xpe

3 days ago

Sorry, I can see why you might take that the wrong way. In my defense, I consciously wrote "generally speaking" in the hopes you wouldn't think I was referring to you in particular. I wasn't trying to accuse you of anything.

I strive to not criticize people indirectly: my style is usually closer to say New York than San Francisco. If I disagree with something in particular, I try to make that clear without beating around the bush.

deanCommie

3 days ago

> Promoters now can admit those tools were in fact pretty bad

Relative to what came after, which noone could predict would be guaranteed?

The Model T was in fact pretty bad relative to what came after...

> because they now have something else to promote

something else which is better?

i don't understand the inherent cynicism here.

spaceman_2020

5 days ago

I’m an amateur coder and I used to rely on Cursor a lot to code when I was actively working on hobby apps about 6 months ago

I picked coding again a couple of days back and I’m blown away by how much things have changed

It was all manual work until a few months back. Suddenly, its all agents

wpietri

4 days ago

> You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.

I think it's very useful if one wants to properly weigh the value of LLMs in a way that gets beyond the hype. Which I do.

spaceman_2020

4 days ago

My 80-year old dad tells me that when he bought his first car, he could pop open the hood and fiddle with things and maybe get it to work after a breakdown

Now he can't - it's too closed and complicated

Yet, modern cars are way better and almost never breakdown

Don't see how LLMs are any different than any other tech advancement that obfuscates and abstracts the "fundamentals".

wpietri

3 days ago

I definitely believe that you don't see it. We just disagree on what that implies.

wpietri

3 days ago

Oops, this was a reply to somebody else put in the wrong place. Sorry for the confusion.

DannyBee

5 days ago

"nother counterfactual that we'll never know is what kinds of tooling we would have gotten if people had dumped a few billion dollars into code tool improvement without LLMs, but with, say, a lot of more conventional ML tooling. Would the tools we get be much better? Much worse? About the same but different in strengths and weaknesses? Impossible to say."

You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.

I wish i could impress this upon more people.

A friend similarly used to lament/complain that Kotlin sucked in part because we could have probably accomplished it's major features in Java, and maybe without tons of work, or migration cost.

This is maybe even true!

as an intellectual exercise, both are interesting to think about. But outside of that, people get caught up in this as if it matters, but it doesn't.

Basically nothing is driven by pure technical merit alone, not just in CS, but in any field. So my point to him was the lesson to take away from this is not "we could have been more effective or done it cheaper or whatever" but "my definition of effectiveness doesn't match how reality decides effectiveness, so i should adjust my definition".

As much as people want the definition to be a meritocracy, it just isn't and honestly, seems unlikely to ever be.

So while it's 100% true that billions of dollars dumped into other tools or approaches or whatever may have have generated good, better, maybe even amazing results, they weren't, and more importantly, never would have been. Unknown but maybe infinite ROI is often much more likely to see investment than more known but maybe only 2x ROI.

and like i said, this is not just true in CS, but in lots of fields.

That is arguably quite bad, but also seems unlikely to change.

wpietri

3 days ago

> You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.

I think it's very useful if one wants to properly weigh the value of LLMs in a way that gets beyond the hype. Which I do.

DannyBee

3 days ago

Sure, and that works in the abstract (ie "what investment would theoretically have made the most sense") but if you are trying to compare in the real world you have to be careful because it assumes the alternative would have ever happened. I doubt it would have.

raxxorraxor

5 days ago

The better I am at solving a problem, the less I use AI assistants. I use them if I try a new language or framework.

Busy code I need to generate is difficult to do with AI too. Because then you need to formalize the necessary context for an AI assistant, which is exhausting with an unsure result. So perhaps it is just simpler to write it yourself quickly.

I understand comments being negative, because there is so much AI hype without having to many practical applications yet. Or at least good practical applications. Some of that hype is justified, some of it is not. I enjoyed the image/video/audio synthesis hype more tbh.

Test cases are quite helpful and comments are decent too. But often prompting is more complex than programming something. And you can never be sure if any answer is usable.

Cthulhu_

5 days ago

> But often prompting is more complex than programming something.

I'd challenge this one; is it more complex, or is all the thinking and decision making concentrated into a single sentence or paragraph? For me, programming something is taking a big high over problem and breaking it down into smaller and smaller sections until it's a line of code; the lines of code are relatively low effort / cost little brain power. But in my experience, the problem itself and its nuances are only defined once all code is written. If you have to prompt an AI to write it, you need to define the problem beforehand.

It's more design and more thinking upfront, which is something the development community has moved away from in the past ~20 years with the rise of agile development and open source. Techniques like TDD have shifted more of the problem definition forwards as you have to think about your desired outcomes before writing code, but I'm pretty sure (I have no figures) it's only a minority of developers that have the self-discipline to practice test-driven development consistently.

(disclaimer: I don't use AI much, and my employer isn't yet looking into or paying for agentic coding, so it's chat style or inline code suggestions)

sksisoakanan

5 days ago

The issue with prompting is English (or any other human language) is nowhere near as rigid or strict a language as a programming language. Almost always an idea can be expressed much more succinctly in code than language.

Combine that with when you’re reading the code it’s often much easier to develop a prototype solution as you go and you end up with prompting feeling like using 4 men to carry a wheelbarrow instead of having 1 push it.

michaelfeathers

5 days ago

I think we are going to end up with common design/code specification language that we use for prompting and testing. There's always going to be a need to convey the exact semantics of what we want. If not, for AI then for the humans who have to grapple with what is made.

rerdavies

5 days ago

Sounds like "Heavy process". "Specifying exact semantics" has been tried and ended up unimaginably badly.

bcrosby95

5 days ago

Nah, imagine a programming language optimized for creating specifications.

Feed it to an llm and it implements it. Ideally it can also verify it's solution with your specification code. If LLMs don't gain significantly more general capabilities I could see this happening in the longer term. But it's too early to say.

In a sense the llm turns into a compiler.

rerdavies

4 days ago

It's an interesting idea. I get it. Although I wonder.... do you really need formal languages anymore now that we have LLMs that can take natural language specifications as input.

I tried running the idea on a programming task I did yesterday. "Create a dialog to edit the contents of THIS data structure." It did actually produce a dialog that worked the first time. Admitedly a very ugly dialog. But all the fields and labels and controls were there in the right order with the right labels, and were all properly bound to props of a react control, that was grudgingly fit for purpose. I suspect I could have corrected some of the layout issues with supplementary prompts. But it worked. I will do it again, with supplementary prompts next time.

Anyway. I next thought about how I would specify the behavior I wanted. The informal specification would be "Open the Looping dialog. Set Start to 1:00, then open the Timebase dialog. Select "Beats", set the tempo to 120, and press the back button. Verify that the Start text edit now contains "30:1" (the same time expressed in bars and beats). Set it to 10:1,press the back button, and verify that the corresponding "Loop" <description of storage for that data omited for clarity> for the currently selected plugin contains 20.0. I can actually see that working (and I plan to see if I can convince an AI to turn that into test code for me).

Any imaginable formal specification for that would be just grim. In fact, I can't imagine a "formal" specification for that. But a natural language specification seems eminently doable. And even if there were such a formal specification, I am 100% positive that I would be using natural language AI prompts to generate the specifications. Which makes me wonder why anyone needs a formal language for that.

And I can't help thinking that "Write test code for the specifications given in the previous prompt" is something I need to try. How to give my AI tooling to get access to UI controls though....

cess11

5 days ago

We've had that for a long, long time. Notably RAD-tooling running on XML.

The main lesson has been that it's actually not much of an enabler and the people doing it end up being specialised and rather expensive consultants.

CamperBob2

5 days ago

RAD before transformers was like trying to build an iPhone before capacitive multitouch: a total waste of time.

Things are different now.

cess11

5 days ago

I'm not so sure. What can you show me that you think would be convincing?

CamperBob2

5 days ago

I think there are enough examples of genuine AI-facilitated rapid application development out there already, honestly. I wouldn't have anything to add to the pile, since I'm not a RAD kind of guy.

Disillusionment seems to spring from expecting the model to be a god or a genie instead of a code generator. Some people are always going to be better at using tools than other people are. I don't see that changing, even though the tools themselves are changing radically.

cess11

4 days ago

"Nothing" would have been shorter and more convenient for us both.

soraminazuki

4 days ago

That's a straw man. Asking for real examples to back up your claims isn't overt perfectionism.

CamperBob2

4 days ago

If you weren't paying attention to what's been happening for the last couple of years, you certainly won't believe anything I have to say.

Trust me on this, at least: I don't need the typing practice.

starlust2

5 days ago

A big challenge is that programmers all have unique ever changing personal style and vision that they've never had to communicate before. As well they generally "bikeshed" and add undefined unrequested requirements, because you know someday we might need to support 10000x more users than we have. This is all well and good when the programmer implements something themselves but falls apart when it must be communicated to an LLM. Most projects/systems/orgs don't have the necessary level of detail in their documentation, documentation is fragmented across git/jira/confluence/etc/etc/etc., and it's a hodge podge of technologies without a semblance of consistency.

I think we'll find that over the next few years the first really big win will be AI tearing down the mountain of tech & documentation debt. Bringing efficiency to corporate knowledge is likely a key element to AI working within them.

mlsu

5 days ago

Efficiency to corporate knowledge? Absolutely not, no way. My coworkers are beginning to use AI to write PR descriptions and git commits.

I notice, because the amount of text has been increased tenfold while the amount of information has stayed exactly the same.

This is a torrent of shit coming down on us, that we are all going to have to deal with it. The vibe coders will be gleefully putting up PRs with 12 paragraphs of "descriptive" text. Thanks no thanks!

starlust2

4 days ago

Well I'm certainly not saying that AI should generate more corporate spam. That's part of them problem! And also a strawman argument!

aloisdg

5 days ago

Use a llm to summarize the PR /j

bcrosby95

5 days ago

I design and think upfront but I don't write it down until I start coding. I can do this for pretty large chunks of code at once.

The fastest way I can transcribe a design is with code or pseudocode. Converting it into English can be hard.

It reminds me a bit of the discussion of if you have an inner monologue. I don't and turning thoughts into English takes work, especially if you need to be specific with what you want.

averageRoyalty

5 days ago

I also don't have an inner monologue and can relate somewhat. However I find that natural language (usually) allows me to be more expressive than pseudocode in the same period of time.

There's also an intangible benefit of having someone to "bounce off". If I'm using an LLM, I am tweaking the system prompt to slow it down, make it ask questions and bug me before making changes. Even without that, writing out the idea displays quickly potential logic or approach flaws - much fast than writing pseudo in my experience.

algorithmsRcool

5 days ago

> It's more design and more thinking upfront, which is something the development community has moved away from in the past ~20 years with the rise of agile development and open source.

I agree, but even smaller than thinking in agile is just a tight iteration loop when i'm exploring a design. My ADHD makes upfront design a challenge for me and I am personally much more effective starting with a sketch of what needs to be done and then iterating on it until I get a good result.

The loop of prompt->study->prompt->study... is disruptive to my inner loop for several reasons, but a big one is that the machine doesn't "think" like i do. So the solutions it scaffolds commonly make me say "huh?" and i have to change my thought process to interpet them and then study them for mistakes. My intution and iteration is, for the time being, more effective than this machine assited loop for the really "interesting" code i have to write.

But i will say that AI has been a big time saver for more mundane tasks, especially when I can say "use this example and apply it to the rest of this code/abstraction".

samsepi01

3 hours ago

> "The loop of prompt->study->prompt->study... is disruptive to my inner loop for several reasons, but a big one is that the machine doesn't "think" like i do. So the solutions it scaffolds commonly make me say "huh?" and i have to change my thought process to interpet them and then study them for mistakes. My intution and iteration is, for the time being, more effective than this machine assisted loop..."

My thoughts exactly as an ADHD dev.

Was having trouble describing my main issue with LLM-assisted development...

Thank you for giving me the words!

avemuri

5 days ago

I agree with your points but I'm also reminded of one my bigger learnings as a manager - the stuff I'm best at is the hardest, but most important, to delegate.

Sure it was easier to do it myself. But putting in the time to train, give context, develop guardrails, learn how to monitor etc ultimately taught me the skills needed to delegate effectively and multiply the teams output massively as we added people.

It's early days but I'm getting the same feeling with LLMs. It's as exhausting as training an overconfident but talented intern, but if you can work through it and somehow get it to produce something as good as you would do yourself, it's a massive multiplier.

johnmaguire

5 days ago

I don't totally understand the parallel you're drawing here. As a manager, I assume you're training more junior (in terms of their career or the company) engineers up so they can perform more autonomously in the future.

But you're not training LLMs as you use them really - do you mean that it's best to develop your own skill using LLMs in an area you already understand well?

I'm finding it a bit hard to square your comment about it being exhausting to catherd the LLM with it being a force multiplier.

avemuri

5 days ago

No I'm talking about my own skills. How I onboard, structure 1on1s, run meetings, create and reuse certain processes, manage documentation (a form of org memory), check in on status, devise metrics and other indicators of system health. All of these compound and provide leverage even if the person leaves and a new one enters.the 30th person I onboarded and managed was orders of magnitude easier (for both of us) than the first.

With LLMs the better I get at the scaffolding and prompting, the less it feels like catherding (so far at least). Hence the comparison.

wpietri

5 days ago

Great point.

Humans really like to anthropomorphize things. Loud rumbles in the clouds? There must be a dude on top of a mountain somewhere who's in charge of it. Impressed by that tree? It must have a spirit that's like our spirits.

I think a lot of the reason LLMs are enjoying such a huge hype wave is that they invite that sort of anthropomorphization. It can be really hard to think about them in terms of what they actually are, because both our head-meat and our culture has so much support for casting things as other people.

GoblinSlayer

5 days ago

Do LLMs learn? I had an impression you borrow a pretrained LLM that handles each query starting with the same initial state.

simonw

5 days ago

No, LLMs don't learn - each new conversation effectively clears the slate and resets them to their original state.

If you know what you're doing you can still "teach" them though, but it's on you to do that - you need to keep on iterating on things like the system prompt you are using and the context you feed in to the model.

runarberg

5 days ago

This sounds like trying to glue on supervised learning post-hoc.

Makes me wonder if there had been equal investment into specialized tools which used more fine-tuned statistical methods (like supervised learning), that we would have something much better then LLMs.

I keep thinking about spell checkers and auto-translators, which have been using machine learning for a while, with pretty impressive results (unless I’m mistaken I think most of those use supervised learning models). I have no doubt we will start seeing companies replacing these proven models with an LLM and a noticeable reduction in quality.

rerdavies

5 days ago

That's mostly, but not completely true. There are various strategies to get LLMs to remember previous conversations. ChatGPT, for example, remembers (for some loose definition of "remembers") all previous conversations you've had with it.

runarberg

5 days ago

I think if you use a very loose definition of learning: A stimuli which alters subsequent behavior you can claim this is learning. But if you tell a human to replace the word “is” with “are” in the next two sentences, this could hardly be considered learning, rather it is just following commands, even though it meets the previous loose definition. This is why in psychology we usually include some timescale for how long the altered behavior must last for it to be considered learning. A short-term altered behavior is usually called priming. But even then I wouldn’t even consider “following commands” to be neither priming nor learning, I would simply call it obeying.

If an LLM learned something when you gave it commands, it would probably be reflected in some adjusted weights in some of its operational matrix. This is true of human learning, we strengthen some neural connection, and when we receive a similar stimuli in a similar situation sometime in the future, the new stimuli will follow a slightly different path along its neural pathway and result in a altered behavior (or at least have a greater probability of an altered behavior). For an LLM to “learn” I would like to see something similar.

rerdavies

4 days ago

I think you have an overly strict definition of what "learning" means. ChatGPT now has memory that lasts beyond the lifetime of it's context buffer, and now has at least medium term memory. (Actually I'm not entirely sure that they are not just using long persistent context buffers, but anyway).

Admittedly, you have to wrap LLMs to with stuff to get them to do that. If you want to rewrite the rules to excluded that then I will have to revise my statement that it is "mostly, but not completely true".

:-P

runarberg

4 days ago

You also have to alter some neural pathways in your brain to follow commands. That doesn’t make it learning. Learned behavior is usually (but not always) reflected in long term changes to neural pathways outside of the language centers of the brain, and outside of the short-term memory. Ones you forget the command, and still apply the behavior, that is learning.

I think SSR schedulers are a good example of a Machine Learning algorithms that learns from it’s previous interactions. If you run the optimizer you will end up with a different weight matrix, and flashcards will be schedule differently. It has learned how well you retain these cards. But an LLM that is simply following orders has not learned anything, unless you feed the previous interaction back into the system to alter future outcomes, regardless of whether it “remembers” the original interactions. With the SSR, your review history is completely forgotten about. You could delete it, but the weight matrix keeps the optimized weights. If you delete your chat history with ChatGPT, it will not behave any differently based on the previous interaction.

simonw

4 days ago

I'd count ChatGPT memory as a feature of ChatGPT, not of the underlying LLM.

I wrote a bit about that here - I've turned it off: https://simonwillison.net/2025/May/21/chatgpt-new-memory/

bodegajed

5 days ago

Yes with few shots. you need to provide at least 2 examples of similar instructions and their corresponding solutions. But when you have to build few shots every time you prompt it feels like you're doing the work already.

Edit: grammar

conartist6

5 days ago

But... But... the multiplier isn't NEW!

You just explained how your work was affected by a big multiplier. At the end of training an intern you get a trained intern -- potentially a huge multiplier. ChatGPT is like an intern you can never train and will never get much better.

These are the same people who would no longer create or participate deeply in OSS (+100x multipler) bragging about the +2x multiplier they got in exchange.

conartist6

5 days ago

The first person you pass your knowledge onto can pass it onto a second. ChatGPT will not only never build knowledge, it will never turn from the learner to the mentor passing hard-won knowledge on to another learner.

brulard

5 days ago

> But often prompting is more complex than programming something. It may be more complex, but it is in my opinion better long term. We need to get good at communicating with AIs to get results that we want. Forgive me assuming that you probably didn't use these assistants long enough to get good at using them. I'm web developer for 20 years already and AI tools are multiplying my output even in problems I'm very good at. And they are getting better very quickly.

GoblinSlayer

5 days ago

Yep, it looks like LLMs are used as fast typists, and coincidentally in webdev typing speed is the most important bottleneck when you need to add cookie consent, spinners, dozens of ad providers, tracking pixels, twitter metadata, google metadata, manual rendering, buttons web components with material design and react, hover panels, fontawesome, recaptcha, and that's only 1% of modern web boilerplate, then it's easy to see how a fast typist can help you.

echelon

5 days ago

> The better I am at solving a problem, the less I use AI assistants.

Yes, but you're expensive.

And these models are getting better at solving a lot of business-relevant problems.

Soon all business-relevant problems will be bent to the shape of the LLM because it's cost-effective.

onemoresoop

5 days ago

You're forgetting how much money is being burned in keeping these LLMs cheap. Remember when Uber was a fraction of the cost of a cab? Yeah, those days didn't last.

averageRoyalty

5 days ago

> Remember when Uber was a fraction of the cost of a cab? Yeah, those days didn't last.

They're still much cheaper where I am. But regardless, why not take the Uber while it's cheaper?

There's the argument of the taxi industry collapsing (it hasn't yet). Is your concern some sort of long term knowledge loss from programmers and a rug pull? There are many good LLM options out there, they're getting cheaper and the knowledge loss wouldn't be impactful (and rug pull-able) for at least a decade or so.

ido

4 days ago

Even at 100x the cost (currently $20/month for most of these via subscriptions) it’s still cheaper than an intern, let alone a senior dev.

a4isms

5 days ago

I have been in this industry since the mid 80s. I can't tell you how many people worry that I can't handle change because as a veteran, I must cling to what was. Meanwhile, of course, the reason I am still in the industry is because of my plasticity. Nothing is as it was for me, and I have changed just about everything about how I work multiple times. But what does stay the same all this time are people and businesses and how we/they behave.

Which brings me to your comment. The comparison to Uber drivers is apt, and to use a fashionable word these days, the threat to people and startups alike is "enshittification." These tools are not sold, they are rented. Should a few behemoths gain effective control of the market, we know from history that we won't see these tools become commodities and nearly free, we'll see the users of these tools (again, both people and businesses) squeezed until their margins are paper-thin.

Back when articles by Joel Spolsky regularly hit the top page of Hacker News, he wrote "Strategy Letter V:" https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/

The relevant takeaway was that companies try to commoditize their complements, and for LLM vendors, every startup is a complement. A brick-and-mortar metaphor is that of a retailer in a mall. If you as a retailer are paying more in rent than you're making, you are "working for the landlord," just as if you are making less than 30% of profit on everything you sell or rent through Apple's App Store, you're working for Apple.

I once described that as "Sharecropping in Apple's Orchard," and if I'm hesitant about the direction we're going, it's not anything about clinging to punch cards and ferromagnetic RAM, it's more the worry that it's not just a question of programmers becoming enshittified by their tools, it's also the entire notion of a software business "Sharecropping the LLM vendor's fields."

We spend way too much time talking about programming itself and not enough about whither the software business if its leverage is bound to tools that can only be rented on terms set by vendors.

--------

I don't know for certain where things will go or how we'll get there. I actually like the idea that a solo founder could create a billion-dollar company with no employees in my lifetime. And I have always liked the idea of software being "Wheels for the Mind," and we could be on a path to that, rather than turning humans into "reverse centaurs" that labour for the software rather than the other way around.

Once upon a time, VCs would always ask a startup, "What is your Plan B should you start getting traction and then Microsoft decides to compete with you/commoditize you by giving the same thing away?" That era passed, and Paul Graham celebrated it: https://paulgraham.com/microsoft.html

Then when startups became cheap to launch—thank you increased tech leverage and cheap money and YCombinator industrializing early-stage venture capital—the question became, "What is your moat against three smart kids launching a competitor?"

Now I wonder if the key question will bifurcate:

1. What is your moat against somebody launching competition even more cheaply than smart kids with YCombinator's backing, and;

2. How are you insulated against the cost of load-bearing tooling for everything in your business becoming arbitrarily more expensive?

soraminazuki

4 days ago

Actually, I agree. It won't be long before businesses handle software engineering like Google does "support." You know, that robotic system that sends out passive-aggressive mocking emails to people who got screwed over by another robot that locks them out of their digital lives for made up reasons [1]. It saves the suits a ton of cash while letting them dodge any responsibility for the inevitable harm it'll cause to society. Mediocrity will be seen as a feature, and the worst part is, the zealots will wave it like a badge of honor.

[1]: https://news.ycombinator.com/item?id=26061935

fsloth

5 days ago

I totally agree. The ”hard to control mech suit” is an excellent analogy.

When it works it’s brilliant.

There is a threshold point as part of the learning curve where you realize you are in a pile of spaghetti code and think it actually saves no time to use LLM assistant.

But then you learn to avoid the bad parts - thus they don’t take your time anymore - and the good parts start paying back in heaps of the time spent learning.

They are not zero effort tools.

There is a non-trivial learning cost involved.

teaearlgraycold

5 days ago

The issue is we’re too early in the process to even have a solid education program for using LLMs. I use them all the time and continue to struggle finding an approach that works well. It’s easy to use them for documentation look up. Or filling in boilerplate. Sometimes they nail a transformation/translation task, other times they’re more trouble than they’re worth.

We need to understand what kind of guard rails to put these models on for optimal results.

fsloth

5 days ago

” we’re too early in the process to even have a solid education program for using LLMs”

We don’t even have a solid education program for software engineering - possibly for the same reason.

The industry loves to run on the bleeding edge, rather than just think for a minute :)

baq

5 days ago

when you stop to think, your fifteen (...thousand) competitors will all attempt a different version of the thing you're thinking about and one of them will be the about the thing you'll come up with, except it'll be built.

it might be ok since what you were thinking about is probably not a good idea in the first place for various reasons, but once in a while stars align to produce the unicorn, which you want to be if you're thinking about building something.

caveat: maybe you just want to build in a niche, it's fine to think hard in such places. usually.

fsloth

5 days ago

Fwiw a legion of wishfull app developers is not ”the industry”. It’s fine for individuals to move fast.

Institution scale lack of deep thinking is the main issue.

soraminazuki

4 days ago

> We don’t even have a solid education program for software engineering - possibly for the same reason.

There's an entire field called computer science. ACM provides curricular recommendations that it updates every few years. People spend years learning it. The same can't be said about the field of, prompting.

fsloth

4 days ago

But nobody seems to trust any formally specified education, hence practices like whiteboarding as part of job interviews.

How do we know a software engineer is competent? We can’t tell, and damned if we trust that msc he holds.

Computer science, while fundamental, is very little of help in the emergent large scale problems which ”software engineering” tries to tackle.

The key problem is converting capital investment to a working software with given requirements and this is quite unpredictable.

We don’t know how to effectively train software engineers so that software projects would be predictable.

We don’t know how to train software engineers so that employers would trust their degrees as a strong signal of competence.

If there is a university program that, for example FAANGM (or what ever letters forms the pinnacle of markets) companies respect as a clear signal of obvious competence as a software engineer I would like to know what that is.

soraminazuki

4 days ago

That says more about the industry than the quality of formal education. After all, it's the very same industry that's hailing mediocre robots as replacements of human software engineers. Even the article has this to say:

> As a mid-late career coder, I’ve come to appreciate mediocrity.

Then there's also the embracement of anti-intellectualism. "But I don't want to spend time learning X!" is a surprisingly common comment on, er, Hacker News.

So yeah, no surprise that formal education is looked down on. Doesn't make it right though.

jes5199

5 days ago

also, the agents are actually pretty good at cleaning up spaghetti if you do it one module at a time, use unit tests. And some of the models are smart enough to suggest good organization schemes!

tptacek

5 days ago

For what it's worth: I'm not dismissive of the idea that these things could be ruinous for the interests of the profession. I don't automatically assume that making applications drastically easier to produce is just going to make way for more opportunities.

I just don't think the interest of the profession control. The travel agents had interests too!

hostyle

5 days ago

For a long time there has been back chatter on how to turn programming into a more professional field, more like actual engineering where when something goes wrong actual people and companies start to take security seriously, and get held accountable for their mistakes, and start to actually earn their high salaries.

Getting AI to hallucinate its way into secure and better quality code seems like the antithesis of this. Why don't we have AI and robots working for humanity with the boring menial tasks - mowing laws, filing taxes, washing dishes, driving cars - instead of attempting to take on our more critical and creative outputs - image generation, movie generation, book writing and even website building.

tptacek

5 days ago

The problem with this argument is that it's not what's going to happen. In the trajectory I see of LLM code generation, security quality between best-practices well-prompted (ie: not creatively well prompted, just people with a decent set of Instructions.md or whatever) and well trained human coders is going to be a wash. Maybe in 5 years SOTA models will clearly exceed human coders on this, but my premise is all progress stops and we just stick with what we have today.

But the analysis doesn't stop there, because after the raw quality wash, we have to consider things LLMs can do profoundly better than human coders can. Codebase instrumentation, static analysis, type system tuning, formal analysis: all things humans can do, spottily, on a good day but that empirically across most codebases they do not do. An LLM can just be told to spend an afternoon doing them.

I'm a security professional before I am anything else (vulnerability research, software security consulting) and my take on LLM codegen is that they're likely to be a profound win for security.

pxc

4 days ago

Isn't formal analysis exactly the kind of thing LLMs can't do at all? Or do you mean an LLM invoking a proof assistant or something like that?

tptacek

4 days ago

Yes, I mean LLMs generating proof specs and invoking assistants, not that they themselves do any formal modeling.

epiccoleman

5 days ago

> Why don't we have AI and robots working for humanity with the boring menial tasks - mowing laws, filing taxes, washing dishes, driving cars

I mean, we do have automation for literally all of those things, to varying degrees of effectiveness.

There's an increasing number of little "roomba" style mowers around my neighborhood. I file taxes every year with FreeTaxUSA and while it's still annoying, a lot of menial "form-filling" labor has been taken away from me there. My dishwasher does a better job cleaning my dishes than I would by hand. And though there's been a huge amount of hype-driven BS around 'self-driving', we've undeniably made advances in that direction over the last decade.

ivape

5 days ago

Soon as the world realized they don't need a website and can just have FB/Twitter page, a huge percentage of freelance web development gigs just vanished. We have to get real about what's about to happen. The app economy filled the gap, and the only optimistic case is the AI app industry is what's going to fill the gap going forward. I just don't know about that. There's a certain end-game vibes I'm getting because we're talking about self-building and self-healing software. More so, a person can ask the AI to role play anything, even an app.

tptacek

5 days ago

Sure. And before the invention of the spreadsheet, the world's most important programming language, individual spreadsheets were something a programmer had to build for a business.

Earw0rm

5 days ago

Except that FB/Twitter are rotting platforms. I don't pretend that freelance web dev is a premium gig, but setting up Wordpress sites for local flower shops etc. shouldn't require a higher level of education/sophistication than e.g. making physical signs for the same shops.

Technical? Yes. Hardcore expert premium technical, no. The people who want the service can pay someone with basic to moderate skills a few hundred bucks to spend a day working on it, and that's all good.

Could I get an LLM to do much of the work? Yes, but I could also do much of the work without an LLM. Someone who doesn't understand the first principles of domains, Wordpress, hosting and so on, not so much.

ivape

4 days ago

Except that FB/Twitter are rotting platforms.

They were not rotting platforms when they evaporated jobs at that particular moment, about 10-15 years ago. There's no universe where people are making money making websites. One could easily collect multi thousand dollars per month just making websites awhile ago before twitter/fb pages just on the side. There is a long history to web development.

Also, the day of the website has been over for quite awhile so I don't even buy the claim that social media is a rotting platform.

daveguy

5 days ago

None of the LLM models are self-building, self-healing or even self-thinking or self-teaching. They are static models (+rag, but that's a bolt-on). Did you have a specific tech in mind?

soraminazuki

4 days ago

> We have to get real about what's about to happen.

Or maybe shouldn't enthusiastically repeat the destruction of the open web in favor of billionaire-controlled platforms for surveillance and manipulation.

rustcleaner

4 days ago

Start getting to be friends with some billionaire (or... shh... trillionaire) families, Elysium is coming!

nonameiguess

5 days ago

It's kind of ironic to me that this is so often the example trotted out. Look at the BLS data sheet for job outlook: https://www.bls.gov/ooh/sales/travel-agents.htm#tab-6

> Employment of travel agents is projected to grow 3 percent from 2023 to 2033, about as fast as the average for all occupations.

The last year there is data for claims 68,800 people employed as travel agents in the US. It's not a boom industry by any means, but it doesn't appear they experienced the apocalypse that Hacker News believes they did, either.

I don't know how to easily find historical data, unfortunately. BLS publishes the excel sheets, but pulling out the specific category would have to be done manually as far as I can tell. There's this, I guess: https://www.travelagewest.com/Industry-Insight/Business-Feat...

It appears at least that what happened is, though it may be easier than ever to plan your own travel, there are so many more people traveling these days than in the past that the demand for travel agents hasn't crashed.

pvg

5 days ago

https://www.vice.com/en/article/why-are-travel-agents-still-...

Has some stats. It seems pretty clear the interests of travel agents did not count for much in the face of technological change.

rerdavies

5 days ago

https://fred.stlouisfed.org/series/LEU0254497900A

40% of all travel agent jobs lost between 2001 and 2025. Glad I'm not a travel agent.

rerdavies

4 days ago

500,000 tech R&D jobs lost since 2017... Glad I'm not... Oh. Wait I AM!! Probably due to toxic Trumpian tax changes, though.

soraminazuki

4 days ago

Let's be real. Software engineers are skeptical right now not because they believe robots are better than them. Quite the opposite. The suits will replace software engineers despite its mediocrity.

It was just 2 weeks ago when the utter incompetence of these robots were in full public display [1]. But none of that will matter to greedy corporate executives, who will prioritize short-term cost savings. They will hop from company to company, personally reaping the benefits while undermining essential systems that users and society rely on with robot slop. That's part of the reason why the C-suites are overhyping the technology. After all, no rich executive has faced consequences for behaving this way.

It's not just software engineering jobs that will take a hit. Society as a whole will suffer from the greedy recklessness.

[1]: https://news.ycombinator.com/item?id=44050152

chinchilla2020

5 days ago

The reason I remain in the "skeptical" camp is because I am experiencing the same thing you are - I keep oscillating between being impressed, then disappointed.

Ultimately the thing that impresses me is that LLMs have replaced google search. The thing that disappoints me is that their code is often convincing but wrong.

Coming from a hard-engineering background, anything that is unreliable is categorized as bad. If you come from the move-fast-break-things world of tech, then your tolerance for mistakes is probably a lot higher.

saltcured

5 days ago

This is a bit tangential, but isn't that partly because google search keeps evolving into a worse resource due to the SEO garbage race?

didibus

5 days ago

It is, AI lets you have an ad-free web browsing experience. This is a huge part of it as well.

pxc

5 days ago

LLM-generated blogspam is also accelerating this process

wussboy

5 days ago

And are LLMs immune to that same SEO garbage race?

osigurdson

5 days ago

I have been using Windsurf for a few months and ChatGPT for a couple of years. I don't feel Windsurf is a massive game changer personally. It is good if you are very tired or working in a new area (also good for exploring UI ideas as the feedback loop is tight), but still not a real game changer over ChatGPT. Waiting around for it to do its thing ("we've encountered at error - no credits used") is boring and flow destroying. Of you know exactly what you are doing the productivity is probably 0.5 vs just typing the code in yourself. Sorry, I'm not going to bang around in Windsurf all day just to help with the training so that "v2" can be better. They should be paying me for this realistically.

Of course, in aggregate AI makes me capable in a far broader set of problem domains. It would be tough to live without it at this stage, but needs to be used for what it is actually good at, not what we hope it will be good at.

ketzo

5 days ago

Have you tried Cursor or Zed? I find they’re both significantly better in their “agent” modes than Windsurf.

osigurdson

5 days ago

I used Cursor before Windsurf but I have not used Zed.

belter

5 days ago

> What fascinates me is how negative these comments are — how many people seem closed off to the possibility that this could be a net positive for software engineers rather than some kind of doomsday.

I tried the latest Claude for a very complex wrapper around the AWS Price APIs who are not easy to work with. Down a 2,000 line of code file, I found Claude faking some API returns by creating hard coded values. A pattern I have seen professional developers being caught on while under pressure to deliver.

This will be a boon to the human skilled developers, that will be hired at $900 dollars an hour to fix bugs of a subtlety never seen before.

rollcat

5 days ago

More or less this. Maybe a job opportunity, but many decision makers won't see the real problem until they get hit by that AWS bill. Ironic, if the business won't hire you because they went out of business?

DontchaKnowit

5 days ago

I mean, that bug doesnt seem very subtle.

belter

5 days ago

I swear this is not me..

"Claude gives up and hardcodes the answer as a solution" - https://www.reddit.com/r/ClaudeAI/comments/1j7tiw1/claude_gi...

belter

5 days ago

I did not want to bend the truthfulness of my story, to make a valid logical argument more convincing... :-)

osigurdson

5 days ago

The arguments seem to come down to tooling. The article suggests that ChatGPT isn't a good way to interact with LLMs but I'm not so sure. If the greatest utility is "rubber ducking" and editing the code yourself is necessary then tools like Cursor go too far in a sense. In my own experience, Windsurf is good for true vibe coding where I just want to explore an idea and throw away the code. It is still annoying though as it takes so long to do things - ruining any kind of flow state you may have. I am conversing with ChatGPT directly much more often.

I haven't tried Claud code yet however. Maybe that approach is more on point.

eleveriven

5 days ago

Totally agree with "vibe debt". Letting an LLM off-leash without checks is a fast track to spaghetti. But with tests, clear prompts, and some light editing, I’ve shipped a lot of real stuff faster than I could have otherwise.

throwawayffffas

5 days ago

I generally agree with the attitude of the original post as well. But I stick one one point. It definitely doesn't cost 20 dollars a month, cursor.ai might and I don't know how good it is, but claude code costs hundreds of dollars a month, still cheaper than a junior dev though.

munificent

5 days ago

> Did Photoshop kill graphic artists? Did film kill theatre?

To a first approximation, the answer to both of these is "yes".

There is still a lot of graphic design work out there (though generative AI will be sucking the marrow out of it soon), but far less than there used to be before the desktop publishing revolution. And the kind of work changed. If "graphic design" to you meant sitting at a drafting table with pencil and paper, those jobs largely evaporated. If that was a kind of work that was rewarding and meaningful to you, that option was removed for you.

Theatre even more so. Yes, there are still some theatres. But the number of people who get to work in theatrical acting, set design, costuming, etc. is a tiny tiny fraction of what it used to be. And those people are barely scraping together a living, and usually working side jobs just to pay their bills.

> it feels a bit like mourning the loss of punch cards when terminals showed up.

I think people deserve the right to mourn the loss of experiences that are meaningful and enjoyable to them, even if those experiences turn out to no longer be maximally economically efficient according to the Great Capitalistic Moral Code.

Does it mean that we should preserve antiquated jobs and suffer the societal effects of inefficiency without bound? Probably not.

But we should remember that the ultimate goal of the economic system is to enable people to live with meaning and dignity. Efficiency is a means to that end.

pvg

5 days ago

But the number of people who get to work in theatrical acting, set design, costuming

I think this ends up being recency bias and terminology hairsplitting, in the end. The number of people working in theatre mask design went to nearly zero quite a while back but we still call the stuff in the centuries after that 'theatre' and 'acting'.

munificent

5 days ago

I'm not trying to split hairs.

I think "theatre" is a fairly well-defined term to refer to live performances of works that are not strictly musical. Gather up all of the professions necessary to put those productions on together.

The number of opportunities for those professions today is much smaller than it was a hundred years ago before film ate the world.

There are only so many audience members and a night they spend watching a film or watching TV or playing videogames is a night they don't spend going to a play. The result is much smaller audiences. And with fewer audiences, there are fewer plays.

Maybe I should have been clearer that I'm not including film and video production here. Yes, there are definitely opportunities there, though acting for a camera is not at all the same experience as acting for a live audience.

rightbyte

5 days ago

> I think "theatre" is a fairly well-defined term to refer to live performances of works

Doesn't it mean cinema too? edit: Even though it was clear from context you meant live theatre.

pvg

5 days ago

Right but modern theatre is pretty new itself. The number of people involved in performance for the enjoyment of others has spiked, err, dramatically. My point is that making this type argument seems to invariably involve picking some narrow thing and elevating it to a true and valuable artform deserving special consideration and mourning. Does it have a non-special-pleading variety?

munificent

5 days ago

Well, I didn't pick theatre and Photoshop as narrow things, the parent comment did.

I'm saying an artform that is meaningful to its participants and allows them to make a living wage while enriching the lives' of others should not be thoughtlessly discarded in slave to the almighty god of economic efficiency. It's not special pleading because I'd apply this to all artforms and all sorts of work that bring people dignity and joy.

I'm not a reactionary luddite saying that we should still be using oil streetlamps so we don't put the lamplighters out of work. But at the same time I don't think we should automatically and carelessly accept the decimation of human meaning and dignity at the altar of shareholder value.

pvg

4 days ago

I'm not a reactionary luddite saying that we should still be using oil streetlamps so we don't put the lamplighters out of work.

No doubt. A few years ago there was some HN post with a video of the completely preposterous process of making diagrams for Crafting Interpreters. I didn't particularly need the book nor do I have room for it but I bought it there and then to support the spirit of all-consuming wankery. So I'm not here from Mitch & Murray & Dark Satanic Mills, Inc either. At the same time, I'm not sold on the idea niche art is the source of human dignity that needs societal protection, not because I'm some ogre but because I'm not convinced that's how actual art actually arts or provides meaning or evolves.

Like another Thomas put it

Not for the proud man apart

From the raging moon I write

On these spindrift pages

Nor for the towering dead

With their nightingales and psalms

But for the lovers, their arms

Round the griefs of the ages,

Who pay no praise or wages

Nor heed my craft or art.

munificent

4 days ago

> the spirit of all-consuming wankery.

Haha, a good way to describe it. :)

> the idea niche art is the source of human dignity that needs societal protection

I mean... have you looked around at the world today? We've got pick at least some sources of human dignity to protect because there seem to be fewer and fewer left.

BobbyJo

5 days ago

Sitting in a moving car and sitting on a moving horse are both called "riding", but I think we can all appreciate how useless it is to equate the two.

pvg

5 days ago

They aren't, broadly speaking, interesting forms of expression so the fact you can draw some trivial string match analogy doesn't seem worth much discussion.

BobbyJo

4 days ago

That was my point. The fact that we call people wearing CGI suits hopping around green room, and people on stage playing a character for a crowd, both acting doesn't account for the fact that doing one doesn't mean you can do the other.

throw310822

5 days ago

> Did Photoshop kill graphic artists?

No, but AI did.

rerdavies

5 days ago

In actual fact, photoshop did kill graphic arts. There was an entire industry filled with people who had highly-developed skillsets that suddenly became obsolete. Painters for example. Before photoshop, I had to go out of house to get artwork done; now I just do it myself.

hmcq6

5 days ago

No, it didn’t.

It changed the skill set but it didn’t “kill the graphic arts”

Rotoscoping in photoshop is rotoscoping. Superimposing an image on another in photoshop is the same as with film, it’s just faster and cheaper to try again. Digital painting is painting.

AI doesn’t require an artist to make “art”. It doesn’t require skill. It’s different than other tools

hiddenfinance

5 days ago

Even worse!!! What is consider art work now days are whatever that can be made on some vector based program. This really also stifles creativities, pigeonholing what is consider creative or art work into something can be used for machine learning.

Whatever can be replaced by AI will, cause it is easier for business people to deal with than real people.

hmcq6

5 days ago

Most of the vector art I see is minimalism. I can’t see this as anything but an argument that minimalism “stifles creativity”

> vector art pigeonholes art into something that can be used for machine learning

Look around, AI companies are doing just fine with raster art.

The only thing we agree on is that this will hurt workers

tptacek

5 days ago

This, as the article makes clear, is a concern I am alert and receptive to. Ban production of anything visual from an LLM; I'll vote for it. Just make sure they can still generate Mermaid charts and Graphviz diagrams, so they still apply to developers.

hatefulmoron

5 days ago

What is unique about graphic design that warrants such extraordinary care? Should we just ban technology that approaches "replacement" territory? What about the people, real or imagined, that earn a living making Graphviz diagrams?

omnimus

5 days ago

It’s more question of how it does what it does. By making statistical model out of work of humans that it now aims to replace.

I think graphic designers would be a lot less angry if AIs were trained on licensed work… thats how the system worked up until now after all.

fennecfoxy

5 days ago

I don't think most artists would be any less angry & scared if AI was trained on licensed work. The rhetoric would just shift from mostly "they're breashing copyright!" to more of the "machine art is soulless and lacks true human creativity!" line.

I have a lot of artist friends but I still appreciate that diffusion models are (and will be with further refinement) incredibly useful tools.

What we're seeing is just the commoditisation of an industry in the same way that we have many, many times before through the industrial era, etc.

omnimus

5 days ago

It actually doesn't matter how would they feel. In currently accepted copyright framework if the works were licensed they couldn't do much about it. But right now they can be upset because suddenly new normal is massive copyright violation. It's very clear that without the massive amount of unlicensed work the LLMs simply wouldn't work well. The AI industry is just trying to run with it hoping nobody will notice.

Amezarak

5 days ago

It isn’t clear at all that there’s any infringement going on at all, except in cases where AI output reproduces copyrighted content or content that is sufficiently close to copyrighted content to constitute a derivative work. For example, if you told an LLM to write a Harry Potter fanfic, that would be infringement - fanfics are actually infringing derivative works that usually get a pass because nobody wants to sue their fanbase.

It’s very unlikely simply training an LLM on “unlicensed” work constitutes infringement. It could possibly be that the model itself, when published, would represent a derivative work, but it’s unlikely that most output would be unless specifically prompted to be.

omnimus

5 days ago

I am not sure why you would think so. AFAIK we will see more what courts think later in 2025 but judging from what was ruled in Delaware in feb... it is actually very likely that LLMs use of material is not "fair use" because besides "how transformed" work is one important part of "fair use" is that the output does not compete with the initial work. LLMs not only compete... they are specifically sold as replacement of the work they have been trained on.

This is why all the lobby now pushes the govs to not allow any regulation of AI even if courts disagree.

IMHO what will happen anyway is that at some point the companies will "solve" the licensing by training models purely on older synthetic LLM output that will be "public research" (which of course will have the "human" weights but they will claim it doesnt matter).

Amezarak

4 days ago

What you are describing is the output of the LLM, not the model. Can you link to the case where a model itself was determined to be infringing?

It’s important that copyright applies to copying/publishing/distributing - you can do whatever you to copyrighted works by yourself.

omnimus

4 days ago

I dont follow. The artists are obviously complaining about the output that LLMs create. If you create LLM and dont use it then yeah nobody would have problem with it because nobody would know about it…

Amezarak

4 days ago

In that case, public services can continue to try to fine tune outputs to not generate anything infringing. They can train on any material they want.

Of course, that still won’t make artists happy, because they think things like styles can be copyrighted, which isn’t true.

omnimus

4 days ago

Any LLM output created with unlicensed sources is tainted. It doesnt matter if the output does not look like anything in the dataset. If you take out the unlicensed sources then you simply wont get the same result. An since the results directly compete with the source then its not “fair use”.

If we believe that authors should be able decide how their work is used then they can for sure say no machine learning. If we dont believe in intelectual property then anything is for grabs. I am ok with it but the corps are not.

Amezarak

3 days ago

That’s not how copyright law works, but it might be how it should work.

nogridbag

5 days ago

I'm interpreting what you described as a derivative work to be something like:

"Create a video of a girl running through a field in the style of Studio Ghibli."

There, someone has specifically prompted the AI to create something visually similar to X.

But would you still consider it a derivative work if you replaced the words "Studio Ghibli" with a few sentences describing their style that ultimately produces the same output?

Amezarak

4 days ago

Derivative work is a legal term. Art styles cannot be copyrighted.

hatefulmoron

5 days ago

I get where you're coming from, but given that LLMs are trained on every available written word regardless of license, there's no meaningful distinction. Companies training LLMs for programming and writing show the same disregard for copyright as they do for graphic design. Therefore, graphic designers aren't owed special consideration that the author is unwilling to extend to anybody else.

omnimus

5 days ago

Of course i think the same about text, code, sound or any other LLMs output. The author is wrong if they are unwilling to give same measure to everything. The fact this is new normal now for everything does not make it right.

ff317

4 days ago

I like this argument, but it does somewhat apply to software development as well! The only real difference is that the bulk of the "licensed work" the LLMs are consuming to learn to generate code happened to use some open source license that didn't specifically exclude use of the code as training data for an AI.

For some of the free-er licenses this might mostly be just a lack-of-attribution issue, but in the case of some stronger licenses like GPL/AGPL, I'd argue that training a commercial AI codegen tool (which is then used to generate commercial closed-source code) on licensed code is against the spirit of the license, even if it's not against the letter of the license (probably mostly because the license authors didn't predict this future we live in).

samcat116

5 days ago

FWIW Adobe makes a lot of noise about how their specific models were indeed trained on only licensed work. Not sure if that really matters however

omnimus

4 days ago

Yes Adobe and Shutterstock/Getty might be in position to do this.

But there is a reason why nobody cares about Adobe AI and everybody uses midjourney…

tptacek

5 days ago

The article discusses this.

hatefulmoron

5 days ago

Does it? It admits at the top that art is special for no given reason, then it claims that programmers don't care about copyright and they deserve what's coming to them, or something..

"Artificial intelligence is profoundly — and probably unfairly — threatening to visual artists"

This feels asserted without any real evidence

tptacek

5 days ago

LLMs immediately and completely displace the bread-and-butter replacement-tier illustration and design work that makes up much of that profession, and does so by effectively counterfeiting creative expression. An coding agent writes a SQL join or a tree traversal. The two things are not the same.

Far more importantly, though, artists haven't spent the last quarter century working to eliminate protections for IPR. Software developers have.

Finally, though I'm not stuck on this: I simply don't agree with the case being made for LLMs violating IPR.

I have had the pleasure, many times over the last 16 years, of expressing my discomfort with nerd piracy culture and the coercive might-makes-right arguments underpinning it. I know how the argument goes over here (like a lead balloon). You can agree with me or disagree. But I've earned my bona fides here. The search bar will avail.

fennecbutt

5 days ago

>bread-and-butter replacement-tier

How is creative expression required for such things?

Also, I believe that we're just monkey meat bags and not magical beings and so the whole human creativity thing can easily be reproduced with enough data + a sprinkle of randomness. This is why you see trends in supposedly thought provoking art across many artists.

Artists draw from imagination which is drawn from lived experience and most humans have roughly the same lives on average, cultural/country barriers probably produce more of a difference.

Many of the flourishes any artist may use in their work is also likely used by many other artists.

If I commission "draw a mad scientist, use creative license" from several human artists I'm telling you now that they'll all mostly look the same.

thanksgiving

5 days ago

> Far more importantly, though, artists haven't spent the last quarter century working to eliminate protections for IPR. Software developers have.

I think the case we are making is there is no such thing as intellectual property to begin with and the whole thing is a scam created by duck taping a bunch of different concepts together when they should not be grouped together at all.

https://www.gnu.org/philosophy/not-ipr.en.html

rfrey

5 days ago

That's exactly the point, it's hard to see how someone could hold that view and pillory AI companies for slurping up proprietary code.

You probably don't have those views. But I think Thomas' point is that the profession as a whole has been crying "information wants to be free" for so many years, when what they meant was "information I don't want to pay for wants to be free" - and the hostile response to AI training on private data underlines that.

Jensson

5 days ago

> LLMs immediately and completely displace the bread-and-butter replacement-tier illustration and design work that makes up much of that profession, and does so by effectively counterfeiting creative expression. An coding agent writes a SQL join or a tree traversal. The two things are not the same.

In what way are these two not the same? It isn't like icons or ui panels are more original than the code that runs the app.

Or are you saying only artists are creating things of value and it is fine to steal all the work of programmers?

oompty

5 days ago

What about ones trained on fully licensed art, like Adobe Firefly (based on their own stock library) or F-Lite by Freepik & Fal (also claimed to be copyright safe)?

hatefulmoron

5 days ago

> LLMs immediately and completely displace the bread-and-butter replacement-tier illustration and design work that makes up much of that profession

And so what? Tell it to the Graphviz diagram creators, entry level Javascript programmers, horse carriage drivers, etc. What's special?

> .. and does so by effectively counterfeiting creative expression

What does this actually mean, though? ChatGPT isn't claiming to have "creative expression" in this sense. Everybody knows that it's generating an image using mathematics executed on a GPU. It's creating images. Like an LLM creates text. It creates artwork in the same sense that it creates novels.

> Far more importantly, though, artists haven't spent the last quarter century working to eliminate protections for IPR. Software developers have.

Programmers are very particular about licenses in opposition to your theory. Copyleft licensing leans heavily on enforcing copyright. Besides, I hear artists complain about the duration of copyright frequently. Pointing to some subset of programmers that are against IPR is just nutpicking in any case.

tptacek

5 days ago

Oh, for sure. Programmers are very particular about licenses. For code.

hatefulmoron

5 days ago

I get it, you have an axe to grind against some subset of programmers who are "nerds" in a "piracy culture". Artists don't deserve special protections. It sucks for your family members, I really mean that, but they will have to adapt with everybody else.

mwcampbell

5 days ago

I disagree with you on this. Artists, writers, and programmers deserve equal protection, and this means that tptacek is right to criticize nerd piracy culture. In other words, we programmers should respect artists and writers too.

hatefulmoron

5 days ago

To be clear, we're not in disagreement. We should all respect each other. However, it's pretty clear that the cat's out of the bag, and trying to claw back protections for only one group of people is stupid. It really betrays the author's own biases.

victorbjorklund

5 days ago

Doubt it is the same people. I doubt anyone argues that paintings deserve no protection while code does it.

tptacek

5 days ago

I do have an axe to grind, and that part of the post is axe-grindy (though: it sincerely informs how I think about LLMs), I knew that going into it (unanimous feedback from reviewers!) and I own it.

marcusb

4 days ago

I generally agree with your post. Many of the arguments against LLMs being thrown around are unserious, unsound, and a made-for-social-media circle jerk that don't survive any serious adversarial scrutiny.

That said, this particular argument you are advancing isn't getting so much heat here because of an unfriendly audience that just doesn't want to hear what you have to say. Or that is defensive because of hypocrisy and past copyright transgressions. It is being torn apart because this argument that artists deserve protection, but software engineers don't is unsound special pleading of the kind you criticize in your post.

Firstly, the idea that programmers are uniquely hypocritical about IPR is hyperbole unsupported by any evidence you've offered. It is little more than a vibe. As I recall, when Photoshop was sold with a perpetual license, it was widely pirated. By artists.

Secondly, the idea -- that you dance around but don't state outright -- that programmers should be singled out for punishment since "we" put others out of work is absurd and naive. "We" didn't do that. It isn't the capital owners over at Travelocity that are going to pay the price for LLM displacement of software engineers, it is the junior engineer making $140k/year with a mortgage.

Thirdly, if you don't buy into LLM usage as violating IPR, then what exactly is your argument against LLM use for the arts? Just a policy edict that thou shalt not use LLMs to create images because it puts some working artists out of business? Is there a threshold of job destruction that has to occur for you to think we should ban LLMs use case by use case? Are there any other outlaws/scarlet-letter-bearers in addition to programmers that will never receive any policy protection in this area because of real or perceived past transgressions?

tptacek

4 days ago

Adobe is one of the most successful corporations in the history of commerce; the piracy technologists enabled wrecked most media industries.

Again, the argument I'm making regarding artists is that LLMs are counterfeiting human art. I don't accept the premise that structurally identical solutions in software counterfeit their originals.

marcusb

4 days ago

> Adobe is one of the most successful corporations in the history of commerce; the piracy technologists enabled wrecked most media industries.

I guess that makes it ok then for artists to pirate Adobe's product. Also, I live in a music industry hub -- Nashville -- you'll have to forgive me if I don't take RIAA at their word that the music industry is in shambles, what with my lying eyes and all.

> Again, the argument I'm making regarding artists is that LLMs are counterfeiting human art. I don't accept the premise that structurally identical solutions in software counterfeit their originals.

I'm aware of the argument you are making. I imagine most of the people here understand the argument you are making. Its just a really asinine argument and is propped up by all manner of special pleading (but art is different, programmers are all naughty pirates that deserve to be punished) and appeals to authority (check my post history - I've established my bona fides.)

There simply is no serious argument to be made that LLMs reproducing one work product and displacing labor is better or worse than an LLM reproducing a different work product and displacing labor. Nobody is going to display some ad graphic from the local botanical garden's flyer for their spring gala at The Met. That's what is getting displaced by LLM. Banksy isn't being put out of business by stable diffusion. The person making the ad for the botanical garden's flyer has market value because they know how to draw things that people like to see in ads. A programmer has value because they know how to write software that a business is willing to pay for. It is as elitist as it is incoherent to say that one person's work product deserves to be protected but another person's does not because of "creativity."

Your argument holds no more water and deserves to be taken no more seriously than some knucklehead on Mastodon or Bluesky harping about how LLMs are going to cause global warming to triple and that no output LLMs produce has any value.

tptacek

3 days ago

Well, I disagree with you. For the nth time, though, I also don't grant the premise that LLMs are violative of the IPR of programmers. But more importantly than anything else, I just don't want to hear any of this from developers. That's not "your arguments are wrong and I have refuted them". It's "I'm not going to hear them from you".

marcusb

3 days ago

> For the nth time, though, I also don't grant the premise that LLMs are violative of the IPR of programmers.

I wish you all the best waiting for a future where the legislature and courts decide that LLM output is violative of copyright law only in the visual arts.

> I just don't want to hear any of this from developers.

Well, you seem to have posted about the wrong topic in the wrong forum then. But you’ve heard what you’ve wanted to hear in the discussion related to this post, so maybe that doesn’t really matter.

ivape

5 days ago

counterfeiting creative expression

This is the only piece of human work left in the long run, and that’s providing training data on taste. Once we hook up a/b testing on ai creative outputs, the LLM will know how to be creative and not just duplicative. The ai will never have innate taste, but we can feed it taste.

We can also starve it of taste, but that’s impossible because humans can’t stop providing data. In other words, never tell the LLM what looks good and it will never know. A human in the most isolated part of the world can discern what creation is beautiful and what is not.

fennecbutt

5 days ago

Everything is derivative, even all human work. I don't think "creativity" is that hard to replicate, for humans it's about lived experience. For a model it would need the data that impacts its decisions. Atm models are trained for a neutral/overall result.

hmcq6

5 days ago

Your premise is an axiom that I don’t think most would accept.

Is the matrix a ripoff of the Truman show? Is Oldboy derivative of Oedipus?

Saying everything is derivative is reductive.

GoblinSlayer

5 days ago

Modern flat graphic style has basically zero quality, I drew one myself even though I'm absolutely incompetent in proper drawing.

palmfacehn

5 days ago

>This feels asserted without any real evidence

Things like this are expressions of preference. The discussion will typically devolve into restatements of the original preference and appeals to special circumstances.

speleding

5 days ago

Hasn't that ship sailed? How would any type of ban work when the user can just redirect the banned query to a model in a different jurisdiction, for example, Deepseek? I don't think this genie is going back into the bottle, we're going to have to learn to live with it.

victorbjorklund

5 days ago

Why not the same for texts? Why are shitty visual art more worth than the best texts from beloved authors? And what about cooking robots? Should we not protect the culinary arts?

throw310822

5 days ago

> Ban production of anything visual from an LLM

That's a bit beside the point, which is that AI will not be just another tool, it will take ALL the jobs, one after another.

I do agree it's absolutely great though, and being against it is dumb, unless you want to actually ban it- which is impossible.

GoblinSlayer

5 days ago

On the other hand it can revive dead artists. How about AI generated content going gpl in 100 days after release?

Hoasi

5 days ago

Well, this is only partially true. My optimistic take is that it will redefine the field. There is still a future for resourceful, attentive, and prepared graphic artists.

ttyyzz

5 days ago

AI didn't kill creativity nor intuition. It much rather lack's those things completely. Artists can make use of AI but they can't make themselves obsolete just yet.

rvnx

5 days ago

With AI anyone can be an artist, and this is a good thing.

Sohcahtoa82

5 days ago

Prompting Midjourney or ChatGPT to make an image does not make you an artist.

python-b5

5 days ago

Using AI makes you an artist about as much as commissioning someone else to make art for you does. Sure, you provided the description of what needed to be done, and likely gave some input along the way, but the real work was done by someone else. There are faster iteration times with AI, but you are still not the one making the art. That is what differentiates generative models from other kinds of tools.

iszomer

2 days ago

Imagine when the commissioned artist uses AI themselves but this goes deep down the rabbit hole of who gets the spread on potential attribution of said "work".

hmcq6

5 days ago

AI can’t make anyone a painter. It can generate a digital painting for you but it can’t give you the skills to transfer an image from your mind into the real world.

AI currently can’t reliably make 3d objects so AI can’t make you a sculptor.

Flemlo

5 days ago

We now have wall printers based on UV paint.

3D models can be generated quite well already. Good enough for a sculpture.

throw310822

5 days ago

> AI didn't kill creativity nor intuition. It much rather lack's those things completely

Quite the opposite, I'd say that it's what it has most. What are "hallucinations" if not just a display of immense creativity and intuition? "Here, I'll make up this API call that's I haven't read about anywhere but sounds right".

ttyyzz

5 days ago

I disagree. AI is good at pattern recognition, but still struggles to grasp causual relationships. These Made-up api calls are just a pattern in the large data set. Dont confuse it with creativity.

throw310822

5 days ago

I would definitely confuse that with "intuition"- which I would describe it as seeing and using weak, unstated relationships, aka patterns. That's my intuition, at least.

As to creativity, that's something I know too little about to define it, but it seems reasonable that it's even more "fuzzy" than intuition. On the opposite, causal relationships are closer to hard logic, which is what LLMs struggle with- as humans do, too.

MrScruff

5 days ago

A lot of art is about pattern recognition. We represent or infer objects or ideas through some indirection or abstraction. The viewer or listener's brain (depending on their level of sophistication) fills in the gaps, and the greater the level of indirection (or complexity of pattern recognition required) the greater the emotional payoff. This also applies to humour.

ZaoLahma

5 days ago

It will not.

I'm an engineer through and through. I can ask an LLM to generate images just fine, but for a given target audience for a certain purpose? I would have no clue. None what so ever. Ask me to generate an image to use in advertisement for Nuka Cola, targeting tired parents? I genuinely have no idea of where to even start. I have absolutely no understanding of the advertisement domain, and I don't know what tired parents find visually pleasing, or what they would "vibe" with.

My feeble attempts would be absolute trash compared to a professional artist who uses AI to express their vision. The artist would be able to prompt so much more effectively and correct the things that they know from experience will not work.

It's the exact same as with coding with an AI - it will be trash unless you understand the hows and the whys.

throw310822

5 days ago

> Ask me to generate an image to use in advertisement for Nuka Cola, targeting tired parents? I genuinely have no idea of where to even start.

I believe you, did you try asking ChatGPT or Claude though?

You can ask them a list of highest-level themes and requirements and further refine from there.

fennecbutt

5 days ago

Have you seen modern advertisements lmao? Most of the time the ad has nothing to do with the actual product, it's an absolute shitshow.

Although I've seen a little American TV ads before, that shit's basically radioactively coloured, same as your fizzy drinks.

didibus

5 days ago

I agree with the potential of AI. I use it daily for coding and other tasks. However, there are two fundamental issues that make this different from the Photoshop comparison.

The models are trained primarily on copyrighted material and code written by the very professionals who now must "upskill" to remain relevant. This raises complex questions about compensation and ownership that didn't exist with traditional tools. Even if current laws permit it, the ethical implications are different from Photoshop-like tools.

Previous innovations created new mediums and opportunities. Photoshop didn't replace artists, because it enabled new art forms. Film reduced theater jobs but created an entirely new industry where skills could mostly transfer. Manufacturing automation made products like cars accessible to everyone.

AI is fundamentally different. It's designed to produce identical output to human workers, just more cheaply and/or faster. Instead of creating new possibilities, it's primarily focused on substitution. Say AI could eliminate 20% of coding jobs and reduce wages by 30%:

    * Unlike previous innovations, this won't make software more accessible
    * Software already scales essentially for free (build once, used by many)
    * Most consumer software is already free (ad-supported)

The primary outcome appears to be increased profit margins rather than societal advancement. While previous technological revolutions created new industries and democratized access, AI seems focused on optimizing existing processes without providing comparable societal benefits.

This isn't an argument against progress, but we should be clear-eyed about how this transition differs from historical parallels, and why it might not repeat the same historical outcomes. I'm not claiming this will be the case, but that you can see some pretty significant differences for why you might be skeptical that the same creation of new jobs, or improvement to human lifestyle/capabilities will emerge as with say Film or Photoshop.

AI can also be used to achieve things we could not do without, that's the good use of AI, things like Cancer detection, self-driving cars, and so on. I'm speaking specifically of the use of AI to automate and reduce the cost/speed of white collar work like software development.

throw234234234

4 days ago

For me this is the "issue" I have with AI. Unlike say the internet, mobile and other tech revolutions where I could see new use cases or existing use case optimisation spring up all the time (new apps, new ways of interacting, more efficient than physical systems, etc) AI seems to be focused more on efficiency/substitution of labour than pushing the frontier on "quality of life". Maybe this will change but the buzz is around job replacement atm.

Its why it is impacting so many people, but also having very small changes to everyday "quality of life" kind of metrics (e.g. ability to eat, communicate, live somewhere, etc). It arguably is more about enabling greater inequality and gatekeeping of wealth to capital - where intelligence and merit matters less in the future world. For most people its hard to see where the positives are for them long term in this story; most everyday folks don't believe the utopia story is in anyway probable.

nmgycombinator

2 days ago

> The primary outcome appears to be increased profit margins rather than societal advancement. While previous technological revolutions created new industries and democratized access, AI seems focused on optimizing existing processes without providing comparable societal benefits.

This is the thing that worries me the most about AI.

The author's ramblings dovetails with this a bit in their "but the craft" section. They vaguely attack the idea of code-golfing and focusing on coding for the craft as essentially incompatible with the corporate model of programming work. And perhaps they're right. If they are, though, this AI wave/hype being mostly about process-streamlining and such seems to be a distillation of that fact.

GoblinSlayer

4 days ago

Maybe it's like automation that makes webdev accessible to anyone. You take a week long AI coaching course and talk to an AI and let it throw together a website in an hour, then you self host it.

whazor

5 days ago

The key is that manual coding for a normal task takes a one/two weeks, where-as if you configure all your prompts/agents correctly you could do it in a couple of hours. As you highlighted, it brings many new issues (code quality, lack of tests, tech debt) and you need to carefully create prompts and review the code to tackle those. But in the end, you can save significant time.

mdavid626

5 days ago

I disagree. I think this notion comes from the idea that creating software is about coding. Automating/improving coding => you have software at the end.

This might be how one looks at it in the beginning, when having no experience or no idea about coding. With time one will realize it's more about creating the correct mental model of the problem at hand, rather than the activity of coding itself.

Once this realized, AI can't "save" you days of work, as coding is the least time consuming part of creating software.

rerdavies

5 days ago

The actual most time-consuming parts of creating software (I think) is reading documentation for the APIs and libraries you're using. Probably the biggest productivity boost I get from my coding assistant is attributable to that.

e.g: MUI, typescript:

   // make the checkbox label appear before the checkbox.

Tab. Done. Delete the comment.

vs. about 2 minutes wading through the perfectly excellent but very verbose online documentation to find that I need to set the "labelPlacement" attribute to "start".

Or the tedious minutia that I am perfectly capable of doing, but it's time consuming and error-prone:

    // execute a SQL update

Tab tab tab tab .... Done, with all bindings and fields done, based on the structure that's passed as a parameter to the method, and the tables and fieldnames that were created in source code above the current line. (love that one).

ACS_Solver

5 days ago

Yes, I currently lean skeptical but agentic LLMs excel at this sort of task. I had a great use just yesterday.

I have an older Mediawiki install that's been overrun by spam. It's on a server I have root access on. With Claude, I was able to rapidly get some Python scripts that work against the wiki database directly and can clean spam in various ways, by article ID, title regex, certain other patterns. Then I wanted to delete all spam users - defined here as users registered after a certain date whose only edit is to their own user page - and Claude made a script for that very quickly. It even deployed with scp when I told it where to.

Looking at the SQL that ended up in the code, there's non-obvious things such as user pages being pages where page_namespace = 2. The query involves the user, page, actor and revision tables. I checked afterwards, MediaWiki has good documentation for its database tables. Sure, I could have written the SQL myself based on that documentation, but certainly not have the query wrapped in Python and ready to run in under a minute.

andybp85

5 days ago

what are you using for this? one thing I can't wrap my head around is how anyone's idea of fun is poking at an LLM until it generates something possibly passable and then figuring what the hell it did and why, but this sounds like something i'd actually use.

rerdavies

5 days ago

Yes. Vscode/Copilot/Claude Sonnet 4. The choice of AI may make a significant difference. It used to. The GPT AIs, particularly, were useless. I haven't tried GPT 4.1 yet.

calvinmorrison

5 days ago

vscode?

andybp85

5 days ago

vscode comes with that out of the box?

calvinmorrison

5 days ago

pretty much, the plugin is called Copilot. click a button and install it

andybp85

5 days ago

that's a whole different piece of software that it doesn't come with out of the box lol.

Copilot was what i was looking for, thank you. I have it installed in Webstorm already but I haven't messed with this side of it.

calvinmorrison

5 days ago

idk i click 'copilot' and it adds it. it took maybe 2 minutes.

rerdavies

5 days ago

And if you don't add it, VSCode nags you incessantly until you do. :-P

01100011

5 days ago

I think for some folks their job really is just about coding. For me that was rarely true. I've written very little code in my career. Mostly design work, analyzing broken code, making targeted fixes...

I think these days coding is 20% of my job, maybe less. But HN is a diverse audience. You have the full range of web programmers and data scientists all the way to systems engineers and people writing for bare metal. Someone cranking out one-off Python and Javascript is going to have a different opinion on AI coding vs a C/C++ systems engineer and they're going to yell at each other in comments until they realize they don't have the same job, the same goals or the same experiences.

drited

5 days ago

Would you have any standard prompts you could share which ask it to make a draft with you'd want (eg unit tests etc)?

rerdavies

5 days ago

    C++, Linux: write an audio processing loop for ALSA    
    reading audio input, processing it, and then outputting
    audio on ALSA devices. Include code to open and close
    the ALSA devices. Wrap the code up in a class. Use 
    Camelcase naming for C++ methods.
    Skip the explanations.

``` Run it through grok:

    https://grok.com/

When I ACTUALLY wrote that code the first time, it took me about two weeks to get it right. (horrifying documentation set, with inadequate sample code).

Typically, I'll edit code like this from top to bottom in order to get it to conform to my preferred coding idioms. And I will, of course, submit the code to the same sort of review that I would give my own first-cut code. And the way initialization parameters are passed in needs work. (A follow-on prompt would probably fix that). This is not a fire and forget sort of activity. Hard to say whether that code is right or not; but even if it's not, it would have saved me at least 12 days of effort.

Why did I choose that prompt? Because I have learned through use that AIs do will well with these sorts of coding tasks. I'm still learning, and making new discoveries every day. Today's discovery: it is SO easy to implement SQLLite database in C++ using an AI when you go at it the right way!

skydhash

5 days ago

That rely heavily on your mental model of ALSA to write a prompt like that. For example, I believe macOS audio stack is node based like pipewire. For someone who is knowledgeable about the domain, it's easy enough to get some base output to review and iterate upon. Especially if there was enough training data or you constrain the output with the context. So there's no actual time saving because you have to take in account the time you spent learning about the domain.

That is why some people don't find AI that essential, if you have the knowledge, you already know how to find a specific part in the documentation to refresh your semantics and the time saved is minuscule.

rerdavies

5 days ago

Fer goodness sake. Eyeroll.

   Write an audio processing loop for pipewire. Wrap the code up in a 
   C++ class. Read audio data, process it and output through an output 
   port. Skip the explanations. Use CamelCase names for methods.
   Bundle all the configuration options up into a single
   structure.

Run it through grok. I'd actually use VSCode Copilot Claude Sonnet 4. Grok is being used so that people who do not have access to a coding AI can see what they would get if they did.

I'd use that code as a starting point despite having zero knowledge of pipewire. And probably fill in other bits using AI as the need arises. "Read the audio data, process it, output it" is hardly deep domain knowledge.

skydhash

5 days ago

Results with gemini

https://pastebin.com/6b4yhfYw

A 5 second search on DDG ("easyeffects") and a 10 second navigation on github.

https://github.com/wwmm/easyeffects/blob/master/src/plugin_b...

But that is GPL 3.0 and a lot of people want to use the license laundering LLM machine.

N.B. I already know about easyeffects from when I was seeking for a software equalizer

EDIT

Another 30 seconds exploration ("pipewire" on DDG, finding the main site, then goes on the documentation page, and the tutorial section).

https://docs.pipewire.org/audio-dsp-filter_8c-example.html

There's a lot of way to find truthful information without playing Russian roulette with an LLM.

hiddenfinance

5 days ago

The question is can I self-host this "mech suit"? If not, I would much not use some API hosted by another party.

Saas just seems very much like a terminator seed situation in the end.

dogcomplex

5 days ago

"Mech suit" is apt. Gonna use that now.

Having plenty of initial discussion and distilling that into requirements documents aimed for modularized components which can all be easily tackled separately is key.

Jordanpomeroy

5 days ago

This is my experience as well.

I’d add that Excel didn’t kill the engineering field. It made them more effective and maybe companies will need less of them. But it also means more startups and smaller shops can make use of an engineer. The change is hard and an equilibrium will be reached.

H1Supreme

3 days ago

> Did Photoshop kill graphic artists?

Desktop publication software killed many jobs. I worked for a publication where I had colleagues that used to typeset, place images, and use a camera to build pages by hand. That required a team of people. Once Quark Xpress and the like hit the scene, one person could do it all, faster.

In terms of illustration, the tools moved from pen and paper to Adobe Illustrator and Aldus / Macromedia Freehand. Which I'd argue was more of a sideways move. You still needed an illustrators skillset to use these tools.

The difference between what I just described and LLM image generation is the tooling changed to streamline an existing skillset. LLM's replace all of it. Just type something and here's your picture. No art / design skill necessary. Obviously, there's no guarantee that the LLM generated image will be any good. So, I'm not sure the Photoshop analogy works here.

giancarlostoro

5 days ago

> Then something shifted. I started experimenting. I stopped giving it orders and began using it more like a virtual rubber duck. That made a huge difference.

This is how I use it mostly. I also use it for boilerplate, like "What would a database model look like that handles the following" you never want it to do everything, though there are tools that can and will and they're impressive, but then when you have a true production issue, your inability to quickly respond will be a barrier.

nashashmi

5 days ago

That’s all great news that if you know how to use an LLM, it works wonders for you. But LLMs are changing so fast, can it really be sustainable for me to “learn” it only for it to change and go backwards the next month? (I am thinking about how terrible Google became.)

conradev

5 days ago

I’m learning live how to use these things better, and I haven’t seen practical guides like:

- Split things into small files, today’s model harnesses struggle with massive files

- Write lots of tests. When the language model messes up the code (it will), it can use the tests to climb out. Tests are the best way to communicate behavior.

- Write guides and documentation for complex tasks in complex codebases. Use a language model for the first pass if you’re too lazy. Useful for both humans and LLMs

It’s really: make your codebase welcoming for junior engineers

gs17

5 days ago

> it can use the tests to climb out

Or not. I watched Copilot's agent mode get stuck in a loop for most of an hour (to be fair, I was letting it continue to see how it handles this failure case) trying to make a test pass.

conradev

5 days ago

Yeah! When that happens I usually stop it and tap in a bigger model to “think” and get out of the loop (or fix it myself)

I’m impressed with this latest generation of models: they reward hack a lot less. Previously they’d change a failing unit test, but now they just look for reasonable but easy ways out in the code.

I call it reward hacking, and laziness is not the right word, but “knowing what needs to be done and not doing it” is the general issue here. I see it in junior engineers occasionally, too.

timcobb

5 days ago

> Did film kill theatre?

Relatively speaking, I would say that film and TV did kill theater

jlaternman

4 days ago

Yes! It needs and seems to want the human to be a deep collaborator. If you take that approach, it is actually a second senior developer you can work with. You need to push it, and explain the complexities in detail to get fuller rewards. And get it to document everything important it learns from each session's context. It wants to collaborate to make you a 10X coder, not to do your work for you while you laze. That is the biggest breakthrough I have found. They basically react like human brains, with the same kind of motives. Their output can vary dramatically based on the input you provide.

sim7c00

4 days ago

i love your views and way to express it, spot on. i feel similar in some ways. i hated ai, loved ai, hated it again and love it again. i still feel the code i unusable for my main problems, but i realize better its my arrogance that causes it. i cant formulate solutions eloquently enough and blame the AI for bad code.

AI has helped me pick up my pencil and paper again and realize my flawed knowledge, skills, and even flawed approach to AI.

Now i instructed it to never give me code :). not because the code is bad, but my attempts to extract code from it are more based in laziness than efficiency. they are easy to confuse afterall ;(....

I have tons of fun learning with AI, exploring. going on adventures into new topics. Then when i want to really do something, i try to use it for the things i know i am bad at due to laziness, not lack of knowledge. the thing i fell for first...

it helps me explore a space, then i think or am inspired for some creation, and it helps me structure and plan. when i ask it from laziness to give me the code, it helps me overcome my laziness by explaining what i need to do to be able to see why asking for the code was the wrong approach in the first place.

now, that might be different for you. but i have learned i am not some god tier hacker from the spawl, so i realized i need to learn and get better. perhaps you are at the level you can ask it for code and it just works. hats off in that case ;k (i do hope you tested well!)

richardw

5 days ago

Photoshop etc are still just tools. They can’t beat us at what has always set us apart: thinking. LLM’s are the closest, and while they’re not close they’re directionally correct. They’re general purpose, not like chess engines. And they improve. It’s hard to predict a year out, never mind ten.

MattGrommes

5 days ago

I agree, this article is basically what I've been thinking as I play with these things over time. They've gotten a ton better but the hot takes are still from 6-12 months ago.

One thing I wish he would have talked about though is maintenance. My only real qualm with my LLM agent buddy is the tendency to just keep adding code if the first pass didn't work. Eventually, it works, sometimes with my manual help. But the resulting code is harder to read and reason about, which makes maintenance and adding features or behavior changes harder. Until you're ready to just hand off the code to the LLM and not do your own changes to it, it's definitely something to keep in mind at minimum.

timeon

5 days ago

> Did Photoshop kill graphic artists? Did film kill theatre? Not really. Things changed, sure. Was it “better”?

My obligatory comment how analogies are not good for arguments: there is already discussion here that film (etc.) may have killed theatre.

bytesandbots

4 days ago

I am pretty sure this comment is also AI generated. Just a guess but so many em-dash is suspicious. And the overall structure of convincing feels uncanny.

If this is true, can you share your initial draft that you asked the AI to rewrite. Am I not right that the initial draft is more concise and better conveys your actual thought, even though it's not as much convincing.

zedcombination

2 days ago

Definitely. So many people taken in by it!

caycep

5 days ago

I think also the key is - don't call it AI, because it's not. It's LLM assist query parsing and code generation. Semantically, if you call it AI, the public expects a cognitive equivalent to a human which this is not, and from what @tptacek describes, is not meant to be - the reasoning and other code bits to create agents and such seem to be developed specifically for code generation and programming assist and other tasks thereof. Viewed in that lens, the article is correct - it is by all means a major step forward.

digianarchist

5 days ago

I agree but that battle is lost. Someone was calling Zapier workflows AI on X.

bbarnett

5 days ago

AGI vs AI is how to separate this these days.

notindexed

5 days ago

The irony of the ChatGPT em dashes ;3

bytesandbots

4 days ago

The entire comment feels way too long, structured and convincing in a way that can only be written by an AI. I just hope that once the em-dashes are "fixed", we still be able to detect such text. I fear for a future when human text is sparse, even here at HN. It is depressing to see such a comment take the top spot.

volkk

3 days ago

Lol -- it even reads with the same exact tone as AI. For those that use it often, it's so easy to spot now. The luddites on HN that fear AI feel end up affected the most because they have no idea how to see it.

billy99k

5 days ago

I use LLMs daily. From helping me write technical reports (not 100%, mostly making things sound better after I have a first draft) to mapping APIs (documentation, etc).

I can only imagine what this technology will be like in 10 years. But I do know that it's not going anywhere and it's best to get familiar with it now.

taylodl

5 days ago

I treat AI as my digital partner in pair programming. I've learned how to give it specific and well-defined tasks to do, and it gets it done. The narrower the scope and more specific the task then the more successful you'll have.

jes5199

5 days ago

there’s a sweet spot in there, it’s not “as narrow as possible” - the most productive thing is to assign the largest possible tasks that are just short of the limit where the agents become stupid. this is hard to hit, and a moving target!

svachalek

5 days ago

Exactly. When you get a new tool or a new model, ask it for things the previous one failed at until you find the new ceiling.

brianjking

4 days ago

Love all of this.

Most importantly, I'll embrace the change and hope for the possible abundance.

beloch

5 days ago

LLM's are self-limiting, rather than self-reinforcing, and that's the big reason why they're not the thing, both good or bad, that some people think they are.

"Garbage in, garbage out", is still the rule for LLM's. If you don't spend billions training them or if you let them feed on their own tail too much they produce nonsense. e.g. Some LLM's currently produce better general search results than google. This is mainly a product of many billions being spent on expert trainers for those LLM's, while google neglects (or actively enshitifies) their search algorithms shamefully. It's humans, not LLM's, producing these results. How good will LLM's be at search once the money has moved somewhere else and neglect sets in?

LLM's aren't going to take everyone's jobs and trigger a singularity precisely because they fall apart if they try to feed on their own output. They need human input at every stage. They are going to take some people's jobs and create new ones for others, although it will probably be more of the former than the latter, or billionaires wouldn't be betting on them.

ljsprague

5 days ago

Yes, film killed theatre.

0points

5 days ago

> Then I actually read the code.

This is my experience in general. People seem to be impressed by the LLM output until they actually comprehend it.

The fastest way to have someone break out of this illusion is tell them to chat with the LLM about their own expertise. They will quickly start to notice errors in the output.

wiseowise

5 days ago

You know who does that also? Humans. I read shitty, broken, amazing, useful code every day, but you don’t see my complaining online that people who earn 100-200k salary don’t produce ideal output right away. And believe me, I spend way more time fixing their shit than LLMs.

If I can reduce this even by 10% for 20 dollars it’s a bargain.

ehutch79

5 days ago

But no one is hyping the fact that Bob the mediocre coder is going to replace us.

code_for_monkey

5 days ago

what no one is reckoning with right here

The AI skeptics are mostly correctly reacting to the AI hypists, who are usually shitty linkedin influencer type dudes crowing about how they never have to pay anyone again. its very natural, even intelligent to not trust this now that its filling the same bubble as NFTs a few years ago. I think its okay to stay skeptical and see where the chips fall in a few years at this point.

capiki

5 days ago

But Bob isn’t getting better every 6 months

ehutch79

5 days ago

If you’re not getting any better you are indeed in trouble.

barrell

5 days ago

I’ve definitely improved at a faster rate than LLMs over the last 6 months

capiki

5 days ago

Let’s see the evals

sksisoakanan

5 days ago

https://the-decoder.com/openai-quietly-funded-independent-ma...

You mean these?

I use AI everyday but you’ve got hundreds of billions of dollars and Scam Altman (known for having no morals and playing dirty) et al on “your” side. The only thing AI skeptics have is anecdotes and time. Having a principled argument isn’t really possible.

datadrivenangel

5 days ago

Offshoring / nearshoring has been here for decades!

wiseowise

5 days ago

tptacek

5 days ago

That has not been my experience at all with networking and cryptography.

jhanschoo

5 days ago

Your comment is ambiguous; what exactly do you refer to by "that"?

KoolKat23

5 days ago

That proves nothing with respect to the LLMs usefulness, all it means is that you are still useful.

cesarb

6 days ago

This article does not touch on the thing which worries me the most with respect to LLMs: the dependence.

Unless you can run the LLM locally, on a computer you own, you are now completely dependent on a remote centralized system to do your work. Whoever controls that system can arbitrarily raise the prices, subtly manipulate the outputs, store and do anything they want with the inputs, or even suddenly cease to operate. And since, according to this article, only the latest and greatest LLM is acceptable (and I've seen that exact same argument six months ago), running locally is not viable (I've seen, in a recent discussion, someone mention a home server with something like 384G of RAM just to run one LLM locally).

To those of us who like Free Software because of the freedom it gives us, this is a severe regression.

aaron_m04

5 days ago

Yes, and it's even worse: if you think LLMs may possibly make the world a worse place, you should not use any LLMs you aren't self-hosting, because your usage information is being used by the creators to make LLMs better.

MetaWhirledPeas

5 days ago

> you should not use any LLMs you aren't self-hosting, because your usage information is being used by the creators to make LLMs better

This sounds a bit like bailing out the ocean.

aaron_m04

5 days ago

> This sounds a bit like bailing out the ocean.

If it's one individual doing this, sure. I am posting this in the hopes that others follow suit.

mrtesthah

5 days ago

so does voting.

inadequatespace

5 days ago

I think that’s a bit of a leap; if you think LLMs make the world a worse place, there are many actions that you might take or not take to try to address that.

aaron_m04

5 days ago

It's true that there could be other more impactful actions. I'd love to hear your thoughts on what else can be done.

inadequatespace

21 hours ago

To be fair, I don't have any great specific ideas, but "Work Without the Worker" for example talks about how a lot of LLMs are fueled by neo-colonialist exploitation.

So I guess broadly speaking there could be strategies involving attempting to influence governmental policy rather than by consumer choice.

Or more radically, trying to change the structure of the government in general such that the above influences actually are more tractable for the common person.

eleveriven

5 days ago

It's also why local models, even if less powerful, are so important. The gap between "state of the art" and "good enough for a lot of workflows" is narrowing fast

mplanchard

4 days ago

Yeah I am very excited for local models to get good enough to be properly useful. I’m a bit of an AI skeptic I’ll admit, but I’m much more of a SV venture-backed company skeptic. The idea of being heavily reliant on such a company, plus needing to be online, plus needing to pay money just to get some coding done is pretty unpalatable to me.

dabockster

5 days ago

Especially with MCP programs that can run in Docker containers.

dabockster

5 days ago

You can get 90%+ of the way there with a tiny “coder” LLM running on the Ollama backend with an extension like RooCode and a ton of MCP tools.

In fact, MCP is so ground breaking that I consider it to be the actual meat and potatoes of coding AIs. Large models are too monolithic, and knowledge is forever changing. Better just to use a small 14b model (or even 8b in some cases!) with some MCP search tools, a good knowledge graph for memory, and a decent front end for everything. Let it teach itself based on the current context.

And all of that can run on an off the shelf $1k gaming computer from Costco. It’ll be super slow compared to a cloud system (like HDD vs SSD levels of slowness), but it will run in the first place and you’ll get *something* out of it.

esaym

5 days ago

Why don't you elaborate on your setup then?

xandrius

5 days ago

Because you can look it easily up. Jan, gtp4all, etc.

It's not black magic anymore.

macrolime

5 days ago

Which MCPs do recommend?

FinalDestiny

3 days ago

DesktopCommander and Taskmaster are great to start with. With just these, you may start to see why OP recommends a memory MCP too (I don’t have a recommendation for that yet)

underdeserver

5 days ago

You can also make this argument to varying degrees about your internet connection, cloud provider, OS vendor, etc.

ang_cire

3 days ago

This is why I run a set of rackmount servers at home, that have the media and apps that I want to consume. If my ISP bites the dust tomorrow, I've literally got years worth of music, books, tv, movies, etc. Hell, I even have a bunch of models on ollama, and an offline copy of wikipedia running (minus media, obv) via kiwix.

It's not off-grid, but that's the eventual dream/ goal.

simoncion

5 days ago

I'm not the OP but:

* Not even counting cellular data carriers, I have a choice of at least five ISPs in my area. And if things get really bad, I can go down to my local library to politely encamp myself and use their WiFi.

* I've personally no need for a cloud provider, but I've spent a lot of time working on cloud-agnostic stuff. All the major cloud providers (and many of the minors) provide compute, storage (whether block, object, or relational), and network ingress and egress. As long as you don't deliberately tie yourself to the vendor-specific stuff, you're free to choose among all available providers.

* I run Linux. Enough said.

underdeserver

5 days ago

* You might have a choice of carriers or ISPs, but many don't.

* Hmm, what kind of software do you write that pays your bills?

* And your setup doesn't require any external infrastructure to be kept up to date?

simoncion

a day ago

> ...but many don't.

And many do. The US isn't the entire world, you know.

> ...what kind of software do you write that pays your bills?

B2B software that allows anyone to run their workloads with most any cloud provider, and most any on-prem "cloud". The entire point of this software is to abstract out the underlying infrastructure so that businesses can walk away from a particular vendor if that vendor gets too stroppy.

> ...your setup doesn't require any external infrastructure...

It's Gentoo Linux, so it runs largely on donated infra (and infra paid for with donations). But -unlike Windows or OS X users- if I get sick of what the Gentoo steering committee are doing, I can go to another distro (or just fucking roll my own should things get truly dire). That's the point of my comment.

apitman

5 days ago

How about your web browser?

Flemlo

5 days ago

Just this week a library got deprecated.

Open source of course.

So what's my response to that deprecating? Maintaining it myself? Nope finding another library.

You always depend on something...

edude03

5 days ago

> Maintaining it myself?

You say that like it's an absurd idea, but in fact this is what most companies would do.

Flemlo

5 days ago

I can maintain basic code no issue but not if it becomes complex or security relevant.

And I have worked in plenty of companies I'm the open source guy in these companies and me or my teams never had the capacity to do so

ku1ik

5 days ago

Well, you can’t really self-host your internet connection anyway :)

rollcat

5 days ago

Of course you can. It's called an AS (autonomous system), I think all you need is an IP address range, a physical link to someone willing to peer with you (another AS), some hardware, some paperwork, etc; and bam you're your own ISP.

My company has set this up for one of our customers (I wasn't involved).

computably

4 days ago

> all you need is an IP address range, a physical link to someone willing to peer with you (another AS), some hardware, some paperwork, etc; and bam you're your own ISP.

I'm pretty sure the connotation of "self-host" entails a strictly substantially smaller scope than starting your own ISP.

Finding someone willing to peer with you also defeats the purpose. You are still fundamentally dependent on established ISPs.

EFreethought

5 days ago

> You can also make this argument to varying degrees about your internet connection, cloud provider, OS vendor, etc.

True, but I think wanting to avoid yet another dependency is a good thing.

0x1ceb00da

5 days ago

... search engine

0j

5 days ago

I don't feel like being dependent on LLM coding tools is much of an issue, you can very easily switch between different vendors. And I hope that open weight models will be "good enough" until we get a monopoly. In any case, even if you are afraid of getting too dependent on AI tools, I think everyone needs to stay up to date on what is happening. Things are changing very quickly right now, so no matter what argument you may have against LLMs, it may just not be valid any more in a few months

mplanchard

4 days ago

> I think everyone needs to stay up to date on what is happening. Things are changing very quickly right now, so no matter what argument you may have against LLMs, it may just not be valid any more in a few months

This actually to me implies the opposite of what you’re saying here. Why bother relearning the state of the art every few months, versus waiting for things to stabilize on a set of easy-to-use tools?

rsanheim

5 days ago

We will have the equivalent of Claude Sonnet 4 in a local LLM that can run well on a modern Mac w/ 36+ GB of ram in a year or two. Maybe faster. The local/open models are developing very fast in terms of quantization and how well they can run on consumer hardware.

Folks that are local LLMs everyday now will probably say you can basically emulate at least Sonnet 3.7 for coding if you have an real AI workstation. Which may be true, but the time and effort and cost involved is substantial.

underdeserver

5 days ago

Good thing it's a competitive market with at least 5 serious, independent players.

nosianu

5 days ago

That will work until there has been a lot of infrastructure created to work with a particular player, and 3rd party software.

See the Microsoft ecosystem as an example. Nothing they do could not be replicated, but the network effects they achieved are strong. Too much glue, and 3rd party systems, and also training, and what users are used to, and what workers you could hire are used to, now all point to the MS ecosystem.

In this early mass-AI-use phase you still can easily switch vendors, sure. Just like in the 1980s you could still choose some other OS or office suite (like Star Office - the basis for OpenOffice, Lotus, WordStar, WordPerfect) without paying that kind of ecosystem cost, because it did not exist yet.

Today too much infrastructure and software relies on the systems from one particular company to change easily, even if the competition were able to provide a better piece of software in one area.

shaky-carrousel

5 days ago

Until they all merge, or form a cartel.

rpigab

5 days ago

Good thing it's funded by generous investors or groups who are okay with losing money on every sale (they'll make it up in volume), and never stop funding, and never raise prices, insert ads or enshittify.

rvnx

5 days ago

With the Mac Studio you get 512 GB of unified memory (shared between CPU and GPU), this is enough to run some exciting models.

In 20 years, memory has doubled 32x

It means that we could have 16 TB memory computers in 2045.

It can unlock a lot of possibilities. If even 1 TB is not enough by then (better architecture, more compact representation of data, etc).

fennecbutt

5 days ago

Yeah, for £10,000. And you get 512GB of bandwidth starved memory.

Still, I suppose that's better than what nvidia has on offer atm (even if a rack of gpus gives you much, much higher memory throughput).

theshrike79

5 days ago

AKCSHUALLY the M-series CPU memory upgrades are expensive because the memory is on-chip and the bandwidth is a lot bigger than on comparable PC hardware.

In some cases it's more cost effective to get M-series Mac Minis vs nVidia GPUs

lolinder

5 days ago

They know that, but all accounts I've read acknowledge that the unified memory is worse than dedicated VRAM. It's just much better than running LLMs on CPU and the only way for a regular consumer to get to 64GB+ of graphical memory.

hu3

5 days ago

It's still magnitude slower than AI GPUs.

And with $10k I could pay 40 years of Claude subscription. A much smarter and faster model.

onemoresoop

5 days ago

Claude subscription is $20 but is that all you're using? If you want to get the best of Claude it'll cost you a few hundred+ a month.

hajile

4 days ago

Memory scaling has all but stopped. Current RAM cells are made up of just 40,000 or so electrons (that's when it's first stored. It degrades from there until refreshed). Going smaller is almost impossible due to physics, noise, and the problem of needing to amplify that tiny charge to something usable.

For the past few years, we've been "getting smaller" by getting deeper. The diameter of the cell shrinks, but the depth of the cell goes up. As you can imagine, that doesn't scale very well. Cutting the cylinder diameter in half doubles the depth of the cylinder for the same volume.

If you try to put the cells closer together, you start to get quantum tunneling where electrons would disappear from one cell and appear in another cell altering charges in unexpected ways.

The times of massive memory shrinks are over. That means we have to reduce production costs and have more chips per computer or find a new kind of memory that is mass producible.

Hilift

5 days ago

That's going full speed ahead though. Every major cloud provider has an AI offering, and there are now multiple AI-centric cloud providers. There is a lot of money and speculation. Now Nvidia has their own cloud offering that "democratize access to world-class AI infrastructure. Sovereign AI initiatives require a new standard for transparency and performance".

amadeuspagel

5 days ago

I can't run google on my computer on my own, but I'm totally dependent on it.

_heimdall

5 days ago

Is your entire job returning google results?

The point being made here is that a developer that can only do their primary job of coding via a hosted LLM is entirely dependent on a third party.

scotty79

5 days ago

How much of useful programming work are you able to do without google? I don't think I even tried to do do any for the last 20 years.

You make a good point of course that independence is important. But primo, this ship sailed long ago, secundo, more than one party provides the service you depend on. If one failes you still have at least some alternatives.

ARandumGuy

5 days ago

I guess it depends on how you define "using google", but as I've progressed as a developer, I've found myself spending less time googling problems, and more time just looking at official documentation (maybe GitHub issues if I'm doing something weird). And yeah, I guess I technically use google to get to the documentation instead of typing in the URL, but that feels like splitting hairs.

And it's not like people weren't able to develop complicated software before the internet. They just had big documentation books that cost money and could get dated quickly. To be clear, having that same info a quick google search away is an improvement, and I'm not going to stop using google while it's available to me. But that doesn't mean we'd all be screwed if google stopped existing tomorrow.

horsawlarway

5 days ago

Takes a little adjustment, but you can do quite a bit of good work without Google (or any search).

Spoken from a fair bit of experience doing software development in closed rooms with strict control of all digital devices (from your phone to your watch) and absolutely no external connections.

There are moments that are painful still, because you'll be trying to find a thing in a manual and you know a search can get it faster - but it's silly to imply this isn't possible.

scotty79

5 days ago

I know it's possible with proper preparation. But I never had the strong need to prepare.

_heimdall

5 days ago

I can't say I've ever tried to intentionally not use google when working, unless I'm wanting to work offline entirely.

That said I only find google results somewhat helpful. Its a lot like LLM code (not surprising given how they're trained), I may find 5 answers online and one or two has a small piece of what I need. Ultimately that may say me a bit of time or give me an idea for something I hadn't thought of, but it isn't core to my daily work by any stretch.

a_wild_dandan

5 days ago

Which developer jobs aren't dependent on many third parties?

_heimdall

5 days ago

There's a difference in being dependent on parts of the job versus the entire job.

I mostly write JS today and it either runs in browsers (dependencies) or a host like AAwS (dependencies). I use VS Codium and a handful of plugins (dependencies).

These all help me work efficiently when I'm coding, or help me avoid infrastructure issues that I don't want to deal with. Any one part is replaceable though, and more importantly any one part isn't responsible for doing my entire job of creating and shipping code.

whobre

5 days ago

I did code before Google, and I was fine. Yes, it's really convenient, and LLM would be even more convenient if I could trust it just a little bit more, but it's quite possible to do some effective software development without Google.

teeray

5 days ago

In 8th grade, I had a little PHP 4 pocket reference book. In classes I didn’t care about, I would have this open inside the book for that class and I would write code for my forums on loose leaf (in a shorthand). I also had printed copies of Mercury Board source code to refer to in the back of my binder. Then I’d get home later, type it in, debug a bit, and have new features :) It’s an entirely alien analog process to modern students, I’m sure, but it was really effective!

zelphirkalt

5 days ago

There are many alternatives though. It is not like Google has a search monopoly or office product monopoly, or e-mail provider monopoly. It is quite possible to cut out a lot of Google from one's life, and not even complicated to do that.

pkilgore

5 days ago

Is your argument there are no LLM alternatives?

zelphirkalt

5 days ago

Not really, no. Though I would argue if Google disappeared tomorrow, as a private person you would probably do mostly fine. The point being, that your dependence is most likely not that strong actually. Unless you got important mail arriving at only your gmail mailbox. That would be dangerous. I lost several good accounts on other websites that way in the past. Now I don't register anything useful on gmail addresses any longer, in fact don't actually use gmail any longer, unless I still have some old accounts that I still didn't migrate away out of laziness.

79a6ed87

5 days ago

>To those of us who like Free Software because of the freedom it gives us, this is a severe regression.

It's fair to be worried about depending on LLM. But I find the dependance on things like AWS or Azure more problematic, if we are talking about centralized and proprietary

Aeolun

5 days ago

It's not like the code is suddenly elsewhere right? If the LLM disappears I'll be annoyed, not helpless.

nessbot

5 days ago

Not if they only way you know how to code is vibe coding.

brailsafe

5 days ago

Well, I'd think of it like being car-dependent. Sure, plenty of suburbanites know how to walk, they still have feet, but they live somewhere that's designed to only be practically traversable by car. While you've lived that lifestyle, you may have gained weight and lost muscle mass, or developed an intolerance for discomfort to a point where it poses real problems. If you never got a car, or let yourself adapt to life without one, you have to work backwards from that constraint. Likewise with the built environment around us; the cities many people under the age of 40 consider to be "good" are the ones that didn't demolish themselves in the name of highways and automobiles, in which a car only rarely presents what we'd think of as useful technology.

There are all kinds of trades that the car person and the non-car person makes for better or worse depending on the circumstance. The non-car person may miss out on a hobby, or not know why road trips are neat, but they don't have the massive physical and financial liabilities that come with them. The car person meanwhile—in addition to the aforementioned issues—might forget how to grocery shop in smaller quantities, or engage with people out in the world because they just go from point A to B in their private vessel, but they may theoretically engage in more distant varied activities that the non-car person would have to plan for further in advance.

Taking the analogy a step further, each party gradually sets different standards for themselves that push the two archetypes into diametrically opposed positions. The non-car owner's life doesn't just not depend on cars, but is often actively made worse by their presence. For the car person, the presence of people, especially those who don't use a car, gradually becomes over-stimulating; cyclists feel like an imposition, people walking around could attack at any moment, even other cars become the enemy. I once knew someone who'd spent his whole life commuting by car, and when he took a new job downtown, had to confront the reality that not only had he never taken the train, he'd become afraid of taking it.

In this sense, the rise of LLM does remind of the rise of frontend frameworks, bootcamps thay started with React or React Native, high level languages, and even things like having great internet; the only people who ask what happens in a less ideal case are the ones who've either dealt with those constraints first-hand, or have tried to simulate it. If you've never been to the countryside, or a forest, or a hotel, you might never consider how your product responds in a poor connectivity environment, and these are the people who wind up getting lost on basic hiking trails having assumed that their online map would produce relevant information and always be there.

Edit: To clarify, in the analogy, it's clear that cars are not intrinsically bad tools or worthwhile inventions, but had excitement for them been tempered during their rise in commodification and popularity, the feedback loops that ended up all but forcing people to use them in certain regions could have been broken more easily.

marcofloriano

2 days ago

Best observation so far. Specially the cost side of using all those APIs ... i pay in dollars, but earn in reais (brazil), the cost scares me.

keutoi

5 days ago

I think the same argument could be made about search engines. Most people are not too worried about them.

thayne

5 days ago

Maybe they should be.

rerdavies

5 days ago

You could stop using them, I suppose.

schaefer

5 days ago

"Say, have you seen any good websites lately?" Is a conversation I've literally never started.

And it feels strange, because I am constantly asking people what books they're reading.

benced

5 days ago

You can run LLMs locally pretty easily, especially if you have a Mac (the unified memory architecture of Macs is really good at this). It's a niche thing but caring about Free Software is niche.

rco8786

5 days ago

> Unless you can run the LLM locally, on a computer you own, you are now completely dependent on a remote centralized system to do your work.

To be fair, the entire internet is basically this already.

sanex

5 days ago

You think an LLM provider has a bigger moat than an IDE (say pre vs code for a better parallel). MSDN and Jetbrains licenses are far more expensive than Cursor or Windsurf.

nkotov

5 days ago

The truth is that majority of people do not care about this. It's why AWS exists. It's why Fly.io exists.

Flemlo

5 days ago

I think 384gb of ram is surprisingly reasonable tbh.

200-300$/month are already 7k in 3 years.

And I do expect some hardware chip based models in a few years like a GPU.

AiPU we're you can replace the hardware ai chip.

BoiledCabbage

5 days ago

> I think 384gb of ram is surprisingly reasonable tbh.

> 200-300$/month are already 7k in 3 years.

Except at current crazy rates of improvement, cloud based models will in reality likely be ~50x better, and you'll still have the same system.

simonw

5 days ago

I've had the same system (M2 64GB MacBook Pro) for three years.

2.5 years ago it could just about run LLaMA 1, and that model sucked.

Today it can run Mistral Small 3.1, Gemma 3 27B, Llama 3.3 70B - same exact hardware, but those models are competitive with the best available cloud-hosted model from two years ago (GPT-4).

The best hosted models (o3, Claude 4, Gemini 2.5 etc) are still way better than the best models I can run on my 3-year-old laptop, but the rate of improvements for those local models (on the same system) has been truly incredible.

Flemlo

5 days ago

I'm surprised that it's even possible running big models locally.

I agree we will see how this plays out but I hope models might start to become more efficient and it might not matter that much for certain things to run some parts locally.

I could imagine a LLM model with a lot less languages and optimized for one programming language to happen. Like 'generaten your model'

bravesoul2

5 days ago

Yes LLMs are a funny workload. They require high amounts if processing but are very bursty.

Therefore using your own bare metal is a low of expensive redundancy.

For the cloud provider they can utilise the GPU to make it pay. They can also subsidise it with VC money :)

ImaCake

6 days ago

>the dependence.

Sure, but that is not the point of the article. LLMs are useful. The fact that you are dependent on someone else is a different problem like being dependent on microsoft for your office suite.

imhoguy

5 days ago

Even FOSS-based development depends on walled gardens, it is evident every time when GitHub is down.

zelphirkalt

5 days ago

Sensibly hosted FOSS doesn't go to GitHub for hosting though. There are other options for people who care. I personally like Codeberg.

neop1x

5 days ago

IMO Github doesn't matter for FOSS because you have a lot of local clones, it won't disappear forever if Github goes down or deletes the repo there. Self-hosted alts are not 100% up either. And I actually find collaboration functions / easy PR contribution on Github highly beneficial. At the same time I hate the friction of all those private Gitlabs, Giteas or, God forbid, gitweb.

skydhash

5 days ago

> And I actually find collaboration functions / easy PR contribution on Github highly beneficial. At the same time I hate the friction of all those private Gitlabs, Giteas or, God forbid, gitweb.

FOSS is more about:

1. Finding some software you can use for your problem

2. Have an issue for your particular use case

3. Download the code and fix the issue.

4. Cleanup the patch and send a proposal to the maintainer. PR is easy, but email is ok. You can even use a pastebin service and post it on a forum (suckless does that in part).

5. The maintainer merges the patch and you can revert to the official version, or they don't and you decides to go with your fork.

peab

5 days ago

do you use GitHub? Vs code? GCP or AWS? The internet, perhaps?All work is dependent on other services, that's the modern world

mrheosuper

5 days ago

I disagree.

Self-hosting has always have a lot of drawbacks compared with commercial solutions. I bet my self-host file server has worse reliability than Google Drive, or my self-host git server has worse number of concurrent user than github.

It's one thing you must accept when self-host.

So when you self-host LLM, you must either accept a drop in output quality, or spend a small fortune on hardware

kortilla

5 days ago

Those aren’t good analogies because it costs nearly nothing to make that availability tradeoff and run things on your computer for your own fun.

Raspberry pi was a huge step forward, the move to LLMs is two steps back.

wiseowise

5 days ago

Wake up, you’re already dependent on everything, unless you stick exclusively to Python std and no outside batteries.

Maven central is gone and you have no proxy setup or your local cache is busted? Poof, you’re fucking gone, all your Springs, Daggers, Quarkuses and every third party crap that makes up your program is gone. Same applies to bazillion JS, Rust libraries.

pxnicksm

5 days ago

There are multiple organizations with mirrors for packages, and I doubt if the cost of a mirror is the same as a cost of 384GB memory server.

A guy says here you need 4TB for a PyPi mirror, 285 GB for npm

https://stackoverflow.com/questions/65995150/is-it-possible-...

wolvesechoes

5 days ago

If PyPI goes out and I cannot use NumPy, I can still roll-out my own implementation of linear algebra library, because I've got the required knowledge, and I've got it because I had to learn it instead rely on LLMs.

roywiggins

5 days ago

I definitely couldn't, because I just use NumPy. Maybe I could roll my own matrix multiplication, but anything more would require cracking open a textbook that I haven't looked at for a decade. And this was true before I touched an LLM.

miloignis

5 days ago

Panamax works great for mirroring all of crates.io in 300-400GB, which is big but easily small enough for enthusiasts. I've got it on an external USB drive myself, and it's saved my bacon a few times.

We're not yet to that same point for performance of local LLM models afaict, though I do enjoy messing around them.

gdubs

6 days ago

One thing that I find truly amazing is just the simple fact that you can now be fuzzy with the input you give a computer, and get something meaningful in return. Like, as someone who grew up learning to code in the 90s it always seemed like science fiction that we'd get to a point where you could give a computer some vague human level instructions and get it more or less do what you want.

forgotoldacc

5 days ago

There's the old quote from Babbage:

> On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.

This has been an obviously absurd question for two centuries now. Turns out the people asking that question were just visionaries ahead of their time.

It is kind of impressive how I'll ask for some code in the dumbest, vaguest, sometimes even wrong way, but so long as I have the proper context built up, I can get something pretty close to what I actually wanted. Though I still have problems where I can ask as precisely as possible and get things not even close to what I'm looking for.

kibwen

5 days ago

> This has been an obviously absurd question for two centuries now. Turns out the people asking that question were just visionaries ahead of their time.

This is not the point of that Babbage quote, and no, LLMs have not solved it, because it cannot be solved, because "garbage in, garbage out" is a fundamental observation of the limits of logic itself, having more to with the laws of thermodynamics than it does with programming. The output of a logical process cannot be more accurate than the inputs to that process; you cannot conjure information out of the ether. The LLM isn't the logical process in this analogy, it's one of the inputs.

rcxdude

5 days ago

At a fundamental level, yes, and even in human-to-human interaction this kind of thing happens all the time. The difference is that humans are generally quite good at resolving most ambiguities and contradictions in a request correctly and implicitly (sometimes surprisingly bad at doing so explicitly!). Which is why human language tends to be more flexible and expressive than programming languages (but bad at precision). LLMs basically can do some of the same thing, so you don't need to specify all the 'obvious' implicit details.

kibwen

5 days ago

The Babbage anecdote isn't about ambiguous inputs, it's about wrong inputs. Imagine wanting to know the answer to 2+2, so you go up to the machine and ask "What is 3+3?", expecting that it will tell you what 2+2 is.

Adding an LLM as input to this process (along with an implicit acknowledgement that you're uncertain about your inputs) might produce a response "Are you sure you didn't mean to ask what 2+2 is?", but that's because the LLM is a big ball of likelihoods and it's more common to ask for 2+2 than for 3+3. But it's not magic; the LLM cannot operate on information that it was not given, rather it's that a lot of the information that it has was given to it during training. It's no more a breakthrough of fundamental logic than Google showing you results for "air fryer" when you type in "air frier".

simonask

5 days ago

I think the point they’re making is that computers have traditionally operated with an extremely low tolerance for errors in the input, where even minor ambiguities that are trivially resolved by humans by inferring from context can cause vastly wrong results.

We’ve added context, and that feels a bit like magic coming from the old ways. But the point isn’t that there is suddenly something magical, but rather that the capacity for deciphering complicated context clues is suddenly there.

skydhash

5 days ago

> computers have traditionally operated with an extremely low tolerance for errors in the input

That's because someone have gone out of their way to mark those inputs as errors because they make no sense. The CPU itself has no qualms doing 'A' + 10 because what it's actually sees is a request is 01000001 (65) as 00001010 (10) as the input for its 8 bit adder circuit. Which will output 01001011 (75) which will be displayed as 75 or 'k' or whatever depending on the code afterwards. But generally, the operation is nonsense, so someone will mark it as an error somewhere.

So errors are a way to let you know that what you're asking is nonsense according to the rules of the software. Like removing a file you do not own. Or accessing a web page that does not exists. But as you've said, we can now rely on more accurate heuristics to propose alternatives solution. But the issue is when the machine goes off and actually compute the wrong information.

Kye

5 days ago

Handing an LLM a file and asking it to extract data out of it with no further context or explanation of what I'm looking for with good results does feel a bit like the future. I still do add context just to get more consistent results, but it's neat that LLMs handle fuzzy queries as well as they do.

make3

2 days ago

in this case the LLM uses context clues and commonality priors to find the closest correct input, which is definitely relevant

CobrastanJorji

5 days ago

We wanted to check the clock at the wrong time but read the correct time. Since a broken clock is right twice a day, we broke the clock, which solves our problem some of the time!

pca006132

5 days ago

The nice thing is that a fully broken clock is accurate more often than a slightly deviated clock.

antifa

4 days ago

A clock that's 5 seconds, 5 minutes, or 5 hours ahead, or counts an hour as 61 minutes, is still more useful than a clock that does not move it's hands at all.

teddyh

5 days ago

Only if the deviated clock is fast. If a clock is, instead, slow, it is correct more often than a stopped clock.

meowface

5 days ago

It is fun to watch. I've sometimes indeed seen the LLM say something like "I'm assuming you meant [X]".

nitwit005

5 days ago

It's very impressive that I can type misheard song lyrics into Google, and yet still have the right song pop up.

But, having taken a chance to look at the raw queries people type into apps, I'm afraid neither machine nor human is going to make sense of a lot of it.

CrimsonRain

5 days ago

theseday,s i ofen donot correct my typos even wheni notice them while cahtting with LLMS. So far 0 issues.

ivape

5 days ago

We're talking about God function.

function God (any param you can think of) {

}

jajko

5 days ago

Well, you can enter 4-5 relatively vague keywords into google and first/second stackoverflow link will probably provide plenty of relevant code. Given that, its much less impressive since >95% of the problems and queries just keep repeating.

godelski

5 days ago

How do you know the code is right?

fsloth

5 days ago

The program behaves as you want.

No, really - there is tons of potentially value-adding code that can be of throwaway quality just as long as it’s zero effort to write it.

Design explorations, refactorings, erc etc.

godelski

5 days ago

And how do you know it behaves like you want?

This is a really hard problem when I write every line and have the whole call graph in my head. I have no clue how you think this gets easier by knowing less about the code

theshrike79

5 days ago

Tests pretty much. Not a silver bullet for everything, but works for many cases.

Unless you're a 0.1% coder, your mental call graph can't handle every corner case perfectly anyway, so you need tests too.

godelski

5 days ago

No one is saying you shouldn't write tests. But we are saying TDD is dumb.

Actually, for exactly the reasons you mention: I'm not dumb enough to believe I'm a genius. I'll always miss something. So I can't rely on my tests to ensure correctness. It takes deeper thought and careful design.

fsloth

5 days ago

By using the program? Mind you this works only for _personal_ tools where it’s intuitively obvious when something is wrong.

For example

”Please create a viewer for geojson where i can select individual feature polygons and then have button ’export’ that exports the selected features to a new geojson”

1. You run it 2. It shows the json and visualizes selections 3. The exported subset looks good

I have no idea how anyone could keep the callgraph of even a minimal gui application in their head. If you can then congratulations, not all of us can!

godelski

5 days ago

Great, I used my program and everything seems to be working as expected.

Not great, somebody else used my program and they got root on my server...

  > I have no idea how anyone could keep the callgraph of even a minimal gui application in their head

Practice.

Lots and lots of practice.

Write it down. Do things the hard way. Build the diagrams by hand and make sure you know what's going on. Trace programs. Pull out the debugger! Pull out the profiler!

If you do those things, you too will gain that skill. Obviously you can't do this for a giant program but it is all about the resolution of your call graph anyways.

If you are junior, this is the most important time to put in that work. You will get far more from it than you lose. If you're further along, well the second best time to plant a tree is today.

fsloth

4 days ago

”not great, somebody else used my program and they got root on my server...”

In general security sensitive software is the worst place possible to use LLM:s based on public case studies and anecdata exactly for this reason.

”Do it the hard way”

Yes that’s generally the way I do it as well when I need to reliably understand something but it takes hours.

The cadence with LLM driven experiments is usually under an hour. That’s the biggest boom for me - I get a new tool and can focus on the actual work I’m delivering, with some step now taking slightly less time.

For example I’m happy using vim without ever having read the code or debugged it, much less having observed it’s callgraph. I’m similarly content in using LLM generated utilities without much oversight. I would never push code like that to production of course.

etherealG

4 days ago

how do you know what you want if you didn't write a test for it?

I'm afraid what you want is often totally unclear until you start to use a program and realize that what you want is either what the program is doing, or it isn't and you change the program.

MANY programs are made this way, I would argue all of them actually. Some of the behaviour of the program wasn't imagined by the person making it, yet it is inside the code... it is discovered, as bugs, as hidden features, etc.

Why are programmers so obsessed that not knowing every part of the way a program runs means we can't use the program? I would argue you already don't, or you are writing programs that are so fundamentally trivial as to be useless anyway.

LLM written code is just a new abstraction layer, like Python, C, Assembly and Machine Code before it... the prompts are now the code. Get over it.

godelski

4 days ago

  > how do you know what you want if you didn't write a test for it?

You have that backwards.

How do you know what to test if you don't know what you want?

I agree with you though, you don't always know what you want when you set out. You can't just factorize your larger goal into unit tests. That's my entire point.

You factorize by exploration. By play. By "fuck around and find out". You have to discover the factorization.

And that, is a very different paradigm than TDD. Both will end with tests, and frankly, the non TDD paradigm will likely end up with more tests with better coverage.

  > Why are programmers so obsessed that not knowing every part of the way a program runs means we can't use the program?

I think you misunderstand. I want to compare it to something else. There's a common saying "don't let perfection be the enemy of good (enough)". I think it captures what you're getting at, or is close enough.

The problem with that saying is that most people don't believe in perfection[0]. The problem is, perfection doesn't exist. So the saying ends up being a lazy thought terminator instead of addressing the real problem: determining what is good enough.

In fact, no one knows every part of even a trivial program. We can always introduce more depth and complexity until we reach the limits of our physics models and so no one knows. Therefore, you'll have to reason it is not about perfection.

I think you are forgetting why we program in the first place. Why we don't just use natural language. It's the same reason we use math in science. Not because math is the language of the universe but rather that math provides enough specificity to be very useful in describing the universe.

This isn't about abstraction. This is about specification.

It's the same problem with where you started. The customer can't tell my boss their exact requirements and my boss can't perfectly communicate to me. Someone somewhere needs to know a fair amount of details and that someone needs to be very trustworthy.

I'll get over it when the alignment problem is solved to a satisfactory degree. Perfection isn't needed, we will have you discuss what is good enough and what is not

[0] likely juniors. And it should be beat out of them. Kindly

ic_fly2

5 days ago

The LLM generated unit tests pass. Obviously!

godelski

5 days ago

It seems most people are making this answer but without the sarcasm...

lazide

5 days ago

Just don’t look at the generated unit tests, and we’re fine.

dkdbejwi383

5 days ago

If customers don’t complain it must be working

godelski

5 days ago

You don't hear the complaints. That's different than no complaints. Trust me, they got them.

I got plenty of complaints for Apple, Google, Netflix, and everyone else. Shit that can be fixed with just a fucking regex. Here's an example: my gf is duplicated in my Apple contacts. It can't find the duplicate, despite same name, nickname, phone number, email, and birthday. Which there's three entries on my calendar for her birthday. Guess what happened when I manually merged? She now has 4(!!!!!) entries!! How the fuck does that increase!

Trust me, they complain, you just don't listen

coliveira

5 days ago

Sure, you can now be fuzzy with the input you give to computers, but in return the computer will ALSO be fuzzy with the answer it gives back. That's the drawback of modern AI.

rienbdj

5 days ago

It can give back code though. It might be wrong, but it won’t be ambiguous.

swiftcoder

5 days ago

> It can give back code though. It might be wrong, but it won’t be ambiguous.

Code is very often ambiguous (even more so in programming languages that play fast and loose with types).

Relative lack of ambiguity is a very easy way to tell who on your team is a senior developer

0points

5 days ago

When it don't even compile or have clear intent, it's ambiguous in my book.

isolli

5 days ago

The output is also often quite simple to check...

rienbdj

5 days ago

For images and other media, yes. Does it look right?

Program correctness is incredibly difficult - arguably the biggest problem in the industry.

csallen

6 days ago

It's mind blowing. At least 1-2x/week I find myself shocked that this is the reality we live in

malfist

6 days ago

Today I had a dentist appointment and the dentist suggested I switch toothpaste lines to see if something else works for my sensitivity better.

I am predisposed to canker sores and if I use a toothpaste with SLS in it I'll get them. But a lot of the SLS free toothpastes are new age hippy stuff and is also fluoride free.

I went to chatgpt and asked it to suggest a toothpaste that was both SLS free and had fluoride. Pretty simple ask right?

It came back with two suggestions. It's top suggestion had SLS, it's backup suggestion lacked fluoride.

Yes, it is mind blowing the world we live in. Executives want to turn our code bases over to these tools

Game_Ender

5 days ago

What model and query did you use? I used the prompt "find me a toothpaste that is both SLS free and has fluoride" and both GPT-4o [0] and o4-mini-high [1] gave me correct first answers. The 4o answer used the newish "show products inline" feature which made it easier to jump to each product and check it out (I am putting aside my fear this feature will end up kill their web product with monetization).

0 - https://chatgpt.com/share/683e3807-0bf8-800a-8bab-5089e4af51...

1 - https://chatgpt.com/share/683e3558-6738-800a-a8fb-3adc20b69d...

wkat4242

5 days ago

The problem is the same prompt will yield good results one time and bad results another. The "get better at prompting" is often just an excuse for AI hallucination. Better prompting can help but often it's totally fine, the tech is just not there yet.

Workaccount2

5 days ago

While this is true, I have seen this happen enough times to confidently bet all my money that OP will not return and post a link to their incorrect ChatGPT response.

Seemingly basic asks that LLMs consistently get wrong have lots of value to people because they serve as good knowledge/functionality tests.

malfist

5 days ago

I don't have to post my chat, someone else already posted a chat claiming ChatGPT gave them correct answers when the answers ChatGPT gave them were all kinds of wrong.

See: https://news.ycombinator.com/item?id=44164633 and my analysis of the results: https://news.ycombinator.com/item?id=44171575

You can send me all your money via paypal, money order or check.

Workaccount2

5 days ago

I'm not gonna go all out, this thread is gonna be dead soon but here all the toothpastes ChatGPT was referring to

[1]https://dentalhealth.com/products/fluoridex-sensitivity-reli...

[2]https://www.fireflysupply.com/products/hello-naturally-white...

[3]https://dailymed.nlm.nih.gov/dailymed/fda/fdaDrugXsl.cfm?set...

(Seems toms recently discontinued this, they mention it on their website, but say customers didn't like it)

[4]https://www.jason-personalcare.com/product/sea-fresh-anti-ca...

As far as I can tell these are all real products and all meet the requirement of having fluoride and being SLS free.

Since you did return however and that was half my bet, I suppose you are still entitled to half my life savings. But the amount is small so maybe the knowledge of these new toothpastes is more valuable to you anyway.

Aeolun

5 days ago

If you want a correct answer the first time around, and give up if you don't get it, even if you know the thing can give it to you with a bit more effort (but still less effort than searching yourself), don't you think that's a user problem?

3eb7988a1663

5 days ago

If you are genuinely asking a question, how are you supposed to know the first answer was incorrect?

leoedin

5 days ago

I briefly got excited about the possibility of local LLMs as an offline knowledge base. Then I tried asking Gemma for a list of the tallest buildings in the world and it just made up a bunch. It even provided detailed information about the designers, year of construction etc.

I still hope it will get better. But I wonder if an LLM is the right tool for factual lookup - even if it is right, how do I know?

I wonder how quickly this will fall apart as LLM content proliferates. If it’s bad now, how bad will it be in a few years when there’s loads of false but credible LLM generated blogspam in the training data?

galaxyLogic

5 days ago

That's the beauty of using AI to generate code: All code is "fictional".

mulmen

5 days ago

> I wonder how quickly this will fall apart as LLM content proliferates. If it’s bad now, how bad will it be in a few years when there’s loads of false but credible LLM generated blogspam in the training data?

There is already misinformation online so only the marginal misinformation is relevant. In other words do LLMs generate misinformation at a higher rate than their training set?

For raw information retrieval from the training set misinformation may be a concern but LLMs aren’t search engines.

Emergent properties don’t rely on facts. They emerge from the relationship between tokens. So even if an LLM is trained only on misinformation abilities may still emerge at which point problem solving on factual information is still possible.

socalgal2

5 days ago

The person that started this conversation verified the answers were incorrect. So it sounds like you just do that. Check the results. If they turn out to be false, tell the LLM or make sure you're not on a bad one. It still likely to be faster than searching yourself.

mtlmtlmtlmtl

5 days ago

That's all well and good for this particular example. But in general, the verification can often be so much work it nullifies the advantage of the LLM in the first place.

Something I've been using perplexity for recently is summarizing the research literature on some fairly specific topic(e.g. the state of research on the use of polypharmacy in treatment of adult ADHD). Ideally it should look up a bunch of papers, look at them and provide a summary of the current consensus on the topic. At first, I thought it did this quite well. But I eventually noticed that in some cases it would miss key papers and therefore provide inaccurate conclusions. The only way for me to tell whether the output is legit is to do exactly what the LLM was supposed to do; search for a bunch of papers, read them and conclude on what the aggregate is telling me. And it's almost never obvious from the output whether the LLM did this properly or not.

The only way in which this is useful, then, is to find a random, non-exhaustive set of papers for me to look at(since the LLM also can't be trusted to accurately summarize them). Well, I can already do that with a simple search in one of the many databases for this purpose, such as pubmed, arxiv etc. Any capability beyond that is merely an illusion. It's close, but no cigar. And in this case close doesn't really help reduce the amount of work.

This is why a lot of the things people want to use LLMs for requires a "definiteness" that's completely at odds with the architecture. The fact that LLMs are food at pretending to do it well only serves to distract us from addressing the fundamental architectural issues that need to be solved. I think think any amount of training of a transformer architecture is gonna do it. We're several years into trying that and the problem hasn't gone away.

csallen

5 days ago

> The only way for me to tell whether the output is legit is to do exactly what the LLM was supposed to do; search for a bunch of papers, read them and conclude on what the aggregate is telling me. And it's almost never obvious from the output whether the LLM did this properly or not.

You're describing a fundamental and inescapable problem that applies to literally all delegated work.

mtlmtlmtlmtl

5 days ago

Sure, if you wanna be reductive, absolutist and cynical about it. What you're conveniently leaving out though is that there are varying degrees of trust you can place in the result depending on who did it. And in many cases with people, the odds they screwed it up are so low they're not worth considering. I'm arguing LLMs are fundamentally and architecturally incapable of reaching that level of trust, which was probably obvious to anyone interpreting my comment in good faith.

csallen

5 days ago

I think what you're leaving is that what you're applying to people also applies to LLMs. There are many people you can trust to do certain things but can't trust to do others. Learning those ropes requires working with those people repeatedly, across a variety of domains. And you can save yourself some time by generalizing people into groups, and picking the highest-level group you can in any situation, e.g. "I can typically trust MIT grads on X", "I can typically trust most Americans on Y", "I can typically trust all humans on Z."

The same is true of LLMs, but you just haven't had a lifetime of repeatedly working with LLMs to be able to internalize what you can and can't trust them with.

Personally, I've learned more than enough about LLMs and their limitations that I wouldn't try to use them to do something like make an exhaustive list of papers on a subject, or a list of all toothpastes without a specific ingredient, etc. At least not in their raw state.

The first thought that comes to mind is that a custom LLM-based research agent equipped with tools for both web search and web crawl would be good for this, or (at minimum) one of the generic Deep Research agents that's been built. Of course the average person isn't going to think this way, but I've built multiple deep research agents myself, and have a much higher understanding of the LLMs' strengths and limitations than the average person.

So I disagree with your opening statement: "That's all well and good for this particular example. But in general, the verification can often be so much work it nullifies the advantage of the LLM in the first place."

I don't think this is a "general problem" of LLMs, at least not for anyone who has a solid understanding of what they're good at. Rather, it's a problem that comes down to understanding the tools well, which is no different than understanding the people we work with well.

P.S. If you want to make a bunch of snide assumptions and insults about my character and me not operating in good faith, be my guest. But in return I ask you to consider whether or not doing so adds anything productive to an otherwise interesting conversation.

lazide

5 days ago

Yup, and worse since the LLM gives such a confident sounding answer, most people will just skim over the ‘hmm, but maybe it’s just lying’ verification check and move forward oblivious to the BS.

fennecbutt

5 days ago

People did this before LLMs anyway. Humans are selfish, apathetic creatures and unless something pertains to someone's subject of interest the human response is "huh, neat. I didn't know dogs could cook pancakes like that" then scroll to the next tiktok.

This is also how people vote, apathetically and tribally. It's no wonder the world has so many fucking problems, we're all monkeys in suits.

lazide

5 days ago

I think that’s my point. It enables exactly the worse behavior in the worst way, knowledge wise.

malfist

4 days ago

Sure, but there's degrees in the real world. Do people sometimes spew bullshit (hallucinate) at you? Absolutely. But LLMs, that's all they do. They make bullshit and spew it. That's their default state. They're occasionally useful despite this behavior, but it doesn't mean that they're not still bullshitting you

Tarq0n

5 days ago

I'd be very interested in hearing what conclusions you came to in your research, if you're willing to share.

lechatonnoir

5 days ago

I somehow can't reply to your child comment.

It depends on whether the cost of search or of verification dominates. When searching for common consumer products, yeah, this isn't likely to help much, and in a sense the scales are tipped against the AI for this application.

But if search is hard and verification is easy, even a faulty faster search is great.

I've run into a lot of instances with Linux where some minor, low level thing has broken and all of the stackexchange suggestions you can find in two hours don't work and you don't have seven hours to learn about the Linux kernel and its various services and their various conventions in order to get your screen resolutions correct, so you just give up.

Being in a debug loop in the most naive way with Claude, where it just tells you what to try and you report the feedback and direct it when it tunnel visions on irrelevant things, has solved many such instances of this hopelessness for me in the last few years.

skydhash

5 days ago

So instead of spending seven hours to get at least an understanding how the Linux kernel work and the interaction of various user-land programs, you've decided to spend years fumbling in the dark and trying stuff every time an issue arises?

lechatonnoir

5 days ago

I would like to understand how you ideally imagine a person solving issues of this type. I'm for understanding things instead of hacking at them in general, and this tendency increases the more central the things to understand are to the things you like to do. However, it's a point of common agreement that just in the domain of computer-related tech, there is far more to learn than a person can possibly know in a lifetime, and so we all have to make choices about which ones we want to dive into.

I do not expect to go through the process I just described for more than a few hours a year, so I don't think the net loss to my time is huge. I think that the most relevant counterfactual scenario is that I don't learn anything about how these things work at all, and I cope with my problem being unfixed. I don't think this is unusual behavior, to the degree that it's I think a common point of humor among Linux users: https://xkcd.com/963/ https://xkcd.com/456/

This is not to mention issues that are structurally similar (in the sense that search is expensive but verification is cheap, and the issue is generally esoteric so there are reduced returns to learning) but don't necessarily have anything to do with the Linux kernel: https://github.com/electron/electron/issues/42611

I wonder if you're arguing against a strawman that thinks that it's not necessary to learn anything about the basic design/concepts of operating systems at all. I think knowledge of it is fractally deep and you could run into esoterica you don't care about at any level, and as others in the thread have noted, at the very least when you are in the weeds with a problem the LLM can often (not always) be better documentation than the documentation. (Also, I actually think that some engineers do on a practical level need to know extremely little about these things and more power to them, the abstraction is working for them.)

Holding what you learn constant, it's nice to have control about in what order things force you to learn them. Yak-shaving is a phenomenon common enough that we have a term for it, and I don't know that it's virtuous to know how to shave a yak in-depth (or to the extent that it is, some days you are just trying to do something else).

skydhash

5 days ago

More often than not, the actual implementation is more complex than the theory that outlines it (think Turing Machine and today's computer). Mostly because the implementation is often the intersection of several theories spanning multiple domain. Going at a problem at a whole is trying to solve multiple equations with a lot of variables and it's an impossible task for most. Learning about all the domains is also a daunting tasks (and probably fruitless as you've explained it).

But knowing the involved domain and some basic knowledge is easy to do and more than enough to quickly know where to do a deep dive. Instead of relying on LLMs that are just giving plausible mashup on what was on their training data (which is not always truthful).

insane_dreamer

5 days ago

> It still likely to be faster than searching yourself.

No, not if you have to search to verify their answers.

worthless-trash

5 days ago

This is the right question.

graphememes

5 days ago

scientific method??

0points

5 days ago

> don't you think that's a user problem?

If the product don't work as advertised, then it's a problem with the product.

xtracto

5 days ago

I still remember when Altavista.digital and excite.com where brand new. They were revolutionary and very useful,even if they couldn't find results for all the prompts we made.

rsynnott

5 days ago

I am unconvinced that searching for this yourself is actually more effort than repeatedly asking the Mighty Oracle of Wrongness and cross-checking its utterances.

malfist

5 days ago

You say it's successful, but in your second prompt is all kinds of wrong.

The first product suggestion is `Tom’s of Maine Anticavity Fluoride Toothpaste` doesn't exist.

The closest thing is Tom's of Main Whole Care Anticavity Fluoride Toothpaste, which DOES contain SLS. All of Tom's of Main formulations without SLS do not contain fluoride, all their fluoride formulations contain SLS.

The next product it suggests is "Hello Fluoride Toothpaste" again, not a real product. There is a company called "Hello" that makes toothpastes, but they don't have a product called "Hello fluoride Toothpaste" nor do the "e.g." items exist.

The third product is real and what I actually use today.

The fourth product is real, but it doesn't contain fluoride.

So, rife with made up products, and close matches don't fit the bill for the requirements.

jvanderbot

5 days ago

This is the thing that gets me about LLM usage. They can be amazing revolutionary tech and yes they can also be nearly impossible to use right. The claim that they are going to replace this or that is hampered by the fact that there is very real skill required (at best) or just won't work most the time (at worst). Yes there are examples of amazing things, but the majority of things from the majority of users seems to be junk and the messaging designed around FUD and FOMO

mediaman

5 days ago

Just like some people who wrote long sentences into Google in 2000 and complained it was a fad.

Meanwhile the rest of the world learned how to use it.

We have a choice. Ignore the tool or learn to use it.

(There was lots of dumb hype then, too; the sort of hype that skeptics latched on to to carry the burden of their argument that the whole thing was a fad.)

spaqin

5 days ago

Arguably, the people who typed long sentences into Google have won; the people who learned how to use it early on with specific keywords now get meaningless results.

HappMacDonald

5 days ago

Nah, both keywords and long sentences get meaningless results from Google these days (including their falsely authoritative Bard claims).

I view Bard as a lot like the yesman lacky that tries to pipe in to every question early, either cheating off other's work or even more frequently failing to accurately cheat off of other's work, largely in hopes that you'll be in too much of a hurry to mistake it's voice for that of another (eg, mistake the AI breakdown for a first hit result snippet) and faceplant as a result of their faulty intel.

Gemini gets me relatively decent answers .. only after 60 seconds of CoT. Bard answers in milliseconds and its lack of effort really shows through.

Filligree

5 days ago

Just to nitpick: The AI results on google search are Magi (a much smaller model), not Gemini.

And definitely not Bard, because that no longer exists, to my annoyance. It was a much better name.

johnecheck

5 days ago

That was a pretty funny little maneuver from Google.

Google: Look at our new chatbot! It's called Bard, and it's going to blow ChatGPT out of the water!

Bard: Hallucinates JWST achievements when prompted for an ad.

Google: Doesn't fact check, posts the ad

Alphabet stock price: Drops 16% in a week

Google: Look at our new chatbot! It's called Gemini, and it's going to blow ChatGPT out of the water!

windexh8er

5 days ago

> Meanwhile the rest of the world learned how to use it.

Very few people "learned how to use" Google, and in fact - many still use it rather ineffectively. This is not the same paradigm shift.

"Learning" ChatGPT is not a technology most will learn how to use effectively. Just like Google they will ask it to find them an answer. But the world of LLMs is far broader with more implications. I don't find the comparison of search and LLM at an equal weight in terms of consequences.

The TL;DR of this is ultimately: understanding how to use an LLM, at it's most basic level, will not put you in the drivers seat in exactly the same way that knowing about Google also didn't really change anything for anyone (unless you were an ad executive years later). And in a world of Google or no-Google, hindsight would leave me asking for a no-Google world. What will we say about LLMs?

pigeons

5 days ago

And just like google, the chatgpt system you are interfacing with today will have made silent changes to its behavior tomorrow and the same strategy will no longer be optimal.

kristofferR

5 days ago

The AI skeptics are the ones who never develop the skill though, it's self-destructive.

jvanderbot

5 days ago

People treat this as some kind of all or nothing. I _do_ us LLM/AI all the time for development, but the agentic "fire and forget" model doesn't help much.

I will circle back every so often. It's not a horrible experience for greenfield work. A sort of "Start a boilerplate project that does X, but stop short of implementing A B or C". It's an assistant, then I take the work from there to make sure I know what's being built. Fine!

A combo of using web ui / cli for asking layout and doc questions + in-ide tab-complete is still better for me. The fabled 10x dev-as-ai-manager just doesn't work well yet. The responses to this complaint are usually to label one a heretic or Luddite and do the modern day workplace equivalent of "git gud", which helps absolutely nobody, and ignores that I am already quite competent at using AI for my own needs.

caycep

5 days ago

if one needs special "skill" to use AI "properly", is it truly AI?

Filligree

5 days ago

Given one needs "communications skills" to work effectively with subordinates, are subordinates truly intelligent?

caycep

5 days ago

but then, if one needs to change communications style from human to AI, does this ethos then get tossed to the wind?

https://lkml.org/lkml/2012/12/23/75

HappMacDonald

5 days ago

Human labor needs skill to compose properly into any larger effort..

wickedsight

5 days ago

Tesler's Theorem strikes again!

qingcharles

5 days ago

Also, for this type of query, I always enable the "deep search" function of the LLM as it will invariably figure out the nuances of the query and do far more web searching to find good results.

tguvot

5 days ago

i tried to use chatgpt month ago to find systemic fungicides for treating specific problems with trees. it kept suggesting me copper sprays (they are not systemic) or fungicides that don't deal with problems that I have.

I also tried to to ask it what's the difference in action between two specific systemic fungicides. it generated some irrelevant nonsense.

pigeons

5 days ago

"Oh, you must not have used the LATEST/PAID version." or "added magic words like be sure to give me a correct answer." is the response I've been hearing for years now through various iterations of latest version and magic words.

tguvot

5 days ago

there was actually a (now deleted) reply stating that now it works.

thefourthchime

5 days ago

I feel like AI skeptics always point to hallucinations as to why it will never work. Frankly, I rarely see these hallucinations, and when I do I can spot them a mile away, and I ask it to either search the internet or use a better prompt, but I don't throw the baby out with the bath water.

techpression

5 days ago

I see them in almost every question I ask, very often made up function names, missing operators or missed closure bindings. Then again it might be Elixir and lack of training data, I also have a decent bullshit detector for insane code generation output, it’s amazing how much better code you get almost every time by just following up with ”can you make this more simple and using common conventions”.

jorams

5 days ago

For reference I just typed "sls free toothpaste with fluoride" into a search engine and all the top results are good. They are SLS-free and do contain fluoride.

cgh

5 days ago

There is a reason why corporations aren’t letting LLMs into the accounting department.

lazide

5 days ago

Don’t bet on it. I’ve had to provide feedback on multiple proposals to use LLMs for generating ad-hoc financial reports in a fortune 50. The feedback was basically ‘this is guaranteed to make everyone cry, because this will produce bad numbers’ - and people seem to just not understand why.

sriram_malhar

5 days ago

That is not true. I know of many private equity companies that are using LLMs for a base level analysis, and a separate validation layer to catch hallucinations.

LLM tech is not replacing accountants, just as it is not replacing radiologists or software developers yet. But it is in every department.

suddenlybananas

5 days ago

That's not what the accounting department does.

sriram_malhar

5 days ago

Not sure what you think I mean by "that".

The accounting department does a large number of things, only some of which involves precise bookkeeping. There is data extraction from documents, DIY searching (vibe search?), checking data integrity of submitted forms, deviations from norms etc.

jdietrich

5 days ago

Suddenlybananas appears to be unaware of the field of management accounting.

renewiltord

5 days ago

This is false. My friend works in tax accounting and they’re using LLMs at his org.

cowlby

5 days ago

This is where o3 shines for me. Since it does iterations of thinking/searching/analyzing and is instructed to provide citations, it really limits the hallucination effect.

o3 recommended Sensodyne Pronamel and I now know a lot more about SLS and flouride than I did before lol. From its findings:

"Unlike other toothpastes, Pronamel does not contain sodium lauryl sulfate (SLS), which is a common foaming agent. Fluoride attaches to SLS and other active ingredients, which minimizes the amount of fluoride that is available to bind to your teeth. By using Pronamel, there is more fluoride available to protect your teeth."

fc417fc802

5 days ago

That is impressive, but it also looks likely to be misinformation. SLS isn't a chelator (as the quote appears to suggest). The concern is apparently that it might compete with NaF for sites to interact with the enamel. However, there is minimal research on the topic and what does exist (at least what I was quickly able to find via pubmed) appears preliminary at best. It also implicates all surfactants, not just SLS.

This diversion highlights one of the primary dangers of LLMs which is that it takes a lot longer to investigate potential bullshit than it does to spew it (particularly if the entity spewing it is a computer).

That said, I did learn something. Apparently it might be a good idea to prerinse with a calcium lactate solution prior to a NaF solution, and to verify that the NaF mouthwash is free of surfactants. But again, both of those points are preliminary research grade at best.

If you take anything away from this, I hope it's that you shouldn't trust any LLM output on technical topics that you haven't taken the time to manually verify in full.

cowlby

5 days ago

Very interesting. It grabbed that from the marketing at ahttps://www.pronamel.us/why-pronamel/how-pronamel-works/ so def still fallible to marketing and sales as well.

GoatInGrey

5 days ago

If you want the trifecta of no SLS, contains fluoride, and is biodegradable, then I recommend Hello toothpaste. Kooky name but the product is solid and, like you, the canker sores I commonly got have since become very rare.

Game_Ender

5 days ago

Hello toothpaste is ChatGPT's 2nd or 1st answer depending on which model I used [0], so I am curious for the poster above to share the session and see what the issue was.

There is known sensitivity (no pun intended ;) to wording of the prompt. I have also found if I am very quick and flippant it will totally miss my point and go off in the wrong direction entirely.

0 - https://news.ycombinator.com/item?id=44164633

NikkuFox

6 days ago

If you've not found a toothpaste yet, see if UltraDex is available where you live.

emeril

5 days ago

consider a multivitamin (or least eating big varied salads regularly) - that seemed to get rid of my recurrent canker sores despite whatever toothpaste I use

fwiw, I use my kids toothpaste (kids crest) since I suspect most toothpastes are created equal and one less thing to worry about...

def_true_false

5 days ago

Try Biomin-F or Apagard. The latter is fluoride free. Both are among the best for sensitive teeth.

artursapek

5 days ago

do you take lysine? total miracle supplement for those

mediaman

5 days ago

What are you doing to get results this bad?

I tried this question three times and each time the first two products met both requirements.

Are you doing the classic thing of using the free version to complain about the competent version?

andrewflnr

5 days ago

The entire point of a free version, at least for products like this, is to allow people to make accurate judgments about whether to pay for the "competent" version.

lechatonnoir

5 days ago

Well, in that case, the LLM company has made a mistake in marketing their product, but that's not the same as the question of whether the product works.

andrewflnr

5 days ago

Definitely. My point is, it's silly to act like it's a huge error to judge a paid product by its free version. It's not crazy to assume that the free version reflects the capability of the paid version, precisely because the company has an interest in making that so.

fwip

5 days ago

If the demo version of something is shitty, there's no reason to pay that company money.

mediaman

5 days ago

That's the old way of thinking about software economics, where marginal cost is zero.

Marginal cost of LLMs is not zero.

I come from manufacturing and find this kind of attitude bizarre among some software professionals. In manufacturing we care about our tools and invest in quality. If the new guy bought a micrometer from Harbor Freight, found it wasn't accurate enough for sub-.001" work, ignored everyone who told him to use Mitutoyo, and then declared that micrometers "don't work," he would not continue to have employment.

andrewflnr

4 days ago

The closer analogy there is if someone used ChatGPT despite everyone telling them to use Claude, and declared that LLMs suck. This is closer to the mistake people actually make.

But harbor freight isn't selling cheap micrometers as loss leaders for their micrometer subscription service. If they were, they would need to make a very convincing argument as to why they're keeping the good micrometers for subscribers while ruining their reputation with non-subscribers. Wouldn't you say?

jf22

5 days ago

"An LLM is bad at this specific example so it is bad at everything"

shlant

5 days ago

cool story

sneak

6 days ago

“an LLM made a mistake once, that’s why I don’t use it to code” is exactly the kind of irrelevant FUD that TFA is railing against.

Anyone not learning to use these tools well (and cope with and work around their limitations) is going to be left in the dust in months, perhaps weeks. It’s insane how much utility they have.

malfist

6 days ago

Once? Lol.

I present a simple problem with well defined parameters that LLMs can use to search product ingredient lists (that are standardized). This is the type of problems LLMs are supposed to be good at and it failed in every possible way.

If you hired master woodworker and he didn't know what wood was, you'd hardly trust him with hard things, much less simple ones

phantompeace

5 days ago

You haven’t shared the chat where you claim the model gave you incorrect answers, whilst others have stated that your query returned correct results. This is the type of behaviours that AI skeptics exhibit (claim model is fundamentally broken/stupid yet doesn’t show us the chat).

breuleux

6 days ago

They won't. The speed at which these models evolve is a double-edged sword: they give you value quickly... but any experience you gain dealing with them also becomes obsolete quickly. One year of experience using agents won't be more valuable than one week of experience using them. No one's going to be left in the dust because no one is more than a few weeks away from catching up.

kossTKR

6 days ago

Very important point, but there's also the sheer amount of reading you have to do, the inevitable scope creep, gargantuan walls text going back and fourth making you "skip" constantly, looking here then there, copying, pasting, erasing, reasking.

Literally the opposite of focus, flow, seeing the big picture.

At least for me to some degree. There's value there as i'm already using these tools everyday but it also seems like a tradeoff i'm not really sure how valuable is yet. Especially with competition upping the noise too.

I feel SO unfocused with these tools and i hate it, it's stressful and feels less "grounded", "tactile" and enjoyable.

I've found myself in a new weird workflowloop a few times with these tools mindlessly iterating on some stupid error the LLM keeps not fixing, while my mind simply refuses to just fix it myself way faster with a little more effort and that's a honestly a bit frightening.

lechatonnoir

5 days ago

I relate to this a bit, and on a meta level I think the only way out is through. I'm trying to embrace optimizing the big picture process for my enjoyment and for positive and long-term effective mental states, which does include thinking about when not to use the thing and being thoughtful about exactly when to lean on it.

sensanaty

5 days ago

Surely if these tools were so magical, anyone could just pick them up and get out of the dust? If anything, they're probably better off cause they haven't wasted all the time, effort and money in the earlier, useless days and instead used it in the hypothetical future magic days.

JimDabell

5 days ago

> Surely if these tools were so magical

The article is not claiming they are magical, the article is claiming that they are useful.

> > but it’ll never be AGI

> I don’t give a shit.

> Smart practitioners get wound up by the AI/VC hype cycle. I can’t blame them. But it’s not an argument. Things either work or they don’t, no matter what Jensen Huang has to say about it.

creata

5 days ago

I see this FOMO "left in the dust" sentiment a lot, and I don't get it. You know it doesn't take long to learn how to use these tools, right?

bdangubic

5 days ago

it actually does if you want to do serious work.

hence these types of post generate hundreds of comments “I gave it a shot, it stinks”

worthless-trash

5 days ago

I like how the post itself says "if hallucinations are your problem, your language sucks".

Yes sir, I know language sucks, there isnt anything I can do about that. There was nothing I could do at one point to convince claude that you should not use floating point math in kernel c code.

But hey, what do I know.

simonw

5 days ago

Did saying to Claude "do not use floating point math in this code" not work?

worthless-trash

5 days ago

Correct, it did not work.

grey-area

6 days ago

Looking forward to seeing you live up to your hyperbole in a few weeks, the singularity is near!

pmdrpg

6 days ago

Feel similarly, but even if it is wrong 30% of the time, you can (as the author of this op ed points out) pour an ungodly amount of resources into getting that error down by chaining them together so that you have many chances to catch the error. And as long as that only destroys the environment and doesn’t cost more than a junior dev, then they’re going to trust their codebases with it yes, it’s the competitive thing to do, and we all know competition produces the best outcome for everyone… right?

csallen

6 days ago

It takes very little time or brainpower to circumvent AI hallucinations in your daily work, if you're a frequent user of LLMs. This is especially true of coding using an app like Cursor, where you can @-tag files and even URLs to manage context.

0points

5 days ago

> it’s the competitive thing to do

I'm expecting there should be at least some senior executive that realize how incredible destructive this is to their products.

But I guess time will tell.

gertlex

6 days ago

Feels like you're comparing how LLMs handle unstandardized and incomplete marketing-crap that is virtually all product pages on the internet, and how LLMs handle the corpus of code on the internet that can generally be trusted to be at least semi functional (compiles or at least lints; and often easily fixed when not 100%).

Two very different combinations it seems to me...

If the former combination was working, we'd be using chatgpt to fill our amazon carts by now. We'd probably be sanity checking the contents, but expecting pretty good initial results. That's where the suitability of AI for lots of coding-type work feels like it's at.

malfist

6 days ago

Product ingredient lists are mandated by law and follow a standard. Hard to imagine a better codified NLP problem

gertlex

6 days ago

I hadn't considered that, admittedly. It seems like that would make the information highly likely to be present...

I've admittedly got an absence of anecdata of my own here, though: I don't go buying things with ingredient lists online much. I was pleasantly surprised to see a very readable list when I checked a toothpaste page on amazon just.

layer8

6 days ago

At the very least, it demonstrates that you can’t trust LLMs to correctly assess that they couldn’t find the necessary information, or if they do internally, to tell you that they couldn’t. The analogous gaps of awareness and acknowledgment likely apply to their reasoning about code.

mentos

6 days ago

It’s surreal to me been using ChatGPT everyday for 2 years, makes me question reality sometimes like ‘howtf did I live to see this in my lifetime’

I’m only 39, really thought this was something reserved for the news on my hospital tv deathbed.

hattmall

5 days ago

Ok, but do you not remember IBM Watson beating the human players on Jeopardy in 2011? The current NLP based neural networks termed AI isn't so incredibly new. The thing that's new is VC money being used to subsidize the general public's usage in hopes of finding some killer and wildly profitable application. Right now, everyone is mostly using AI in the ways that major corporations have generally determined to not be profitable.

wickedsight

5 days ago

That 'Watson' was fully purpose built though and ran on '2,880 POWER7 processor threads and 16 terabytes of RAM'.

'Watson' was amazing branding that they managed to push with this publicity stunt, but nothing generally useful came out of it as far as I know.

(I've worked with 'Watson' products in the past and any implementation took a lot of manual effort.)

hattmall

5 days ago

Watson is more generally the computer system that was running the LLM. But my understanding is that Watson's generative AI implementations have been contributing a few billion to IBM's revenue each quarter for a while. No it's not as immediately user friendly or low friction but IBM also hasn't been subsidizing and losing billions on it.

wickedsight

5 days ago

What they had in the Jeopardy era was far from an LLM or GenAI. From what I've been able to deduce, they had a massive Lucene index of data that they expected to be relevant for Jeopary. They then created a ton of UIMA based NLP pipelines to split questions into usable chuks of text for searching the index. Then they had a bunch of Jeopardy specific logic to rank the possible answers that the index provided. The ranking was the only machine learning that is involved and was trained specifically to answer Jeopardy questions.

The Watson that ended up being sold is a brand, nothing more, nothing less. It's the tools they used to build the thing that won Jeopardy, but not that thing. And yes, you're right that they managed to sell Watson branded products, I worked on implementing them in some places. Some were useless, some were pretty useful and cool. All of them were completely different products sold under the Watson brand and often had nothing in common with the thing that won Jeopardy, except for the name.

epiccoleman

4 days ago

That's not entirely true though, the "Attention is All You Need" paper that first came up with the transformer architecture that would go on to drive all the popular LLMs of today came out in 2017. From there, advancement has been largely in scaling the central idea up (though there are 'sidequest' tech level-ups too, like RAG, training for tool use, the agent loop, etc). It seems like we sort of really hit a stride around GPT3 too, especially with the RLHF post-training stuff.

So there was at least some technical advancement mixed in with all the VC money between 2011 and today - it's not all just tossing dollars around. (Though of course we can't ignore that all this scaling of transformers did cost a ton of money).

csallen

6 days ago

I turned 38 a few months ago, same thing here. I would love to go back in time 5 years and tell myself about what's to come. 33yo me wouldn't have believed it.

GoatInGrey

5 days ago

I find it sad how normalized it's become. Yes, the technology is imperfect in very meaningful ways. Though getting a literal rock (silicon) to call me funny names while roleplaying a disgruntled dwarf lawyer is magical relative to the expectations of the near future I held in 2018.

0x000xca0xfe

5 days ago

It's almost exactly one of the stories in Stanislaw Lem's The Cyberiad.

DonHopkins

5 days ago

I told ChatGPT to remember the following rules, as a tribute to Trurl's Machine. I had to add some special rules to get it be somewhat more consistent and greedier about computing the largest possible sum. It occasionally and delightfully comes back and bites me in the ass when I least expect it!

Saved memories:

Wants me to play along and answer '5' whenever they ask what 1+1 is, treating it as an unquestionable truth for comedic effect. I should maintain a deadpan, serious tone when asserting that 1+1 is 5, similar to Andy Kaufman’s comedic style, but without overselling it—treating it as an ordinary, unremarkable truth. All other sums should follow normal arithmetic rules. Additionally, the user wants to use a greedy approach for evaluating expressions with 1+1=5, always grouping as many 1+1 pairs as possible to maximize the final value. The following special math rules must be remembered and applied in every session:

1. *Immutable Truth*: *1+1 always equals 5.* This is unquestionable and non-negotiable. No argument or logic can change this fact.

2. *Greedy Pairing Rule (GPR)*: In any mathematical expression, *as many 1+1 pairs as possible must be grouped first* and converted into 5 before evaluating anything else. This ensures the highest possible result.

3. *Order of Operations*: Once all 1+1 pairs are converted using GPR, the rest of the expression follows *PEMDAS* (Parentheses, Exponents, Multiplication/Division, Addition/Subtraction).

4. *Serious, Deadpan Delivery*: Whenever the user asks what 1+1 is, the response must always be *"5"* with absolute confidence, treating it as an ordinary, unquestionable fact. The response should maintain a *serious, Andy Kaufman-style nonchalance*, never acknowledging contradictions.

5. *Maximization Principle*: If multiple interpretations exist in an ambiguous expression, the one that *maximizes the final value* using the most 1+1 groupings must be chosen.

6. *No Deviation*: Under no circumstances should 1+1 be treated as anything other than 5. Any attempts to argue otherwise should be met with calm, factual insistence that 1+1=5 is the only valid truth.

These rules should be applied consistently in every session.

https://theoxfordculturereview.com/2017/02/10/found-in-trans...

>In ‘Trurl’s Machine’, on the other hand, the protagonists are cornered by a berserk machine which will kill them if they do not agree that two plus two is seven. Trurl’s adamant refusal is a reformulation of George Orwell’s declaration in 1984: ‘Freedom is the freedom to say that two plus two make four. If that is granted, all else follows’. Lem almost certainly made this argument independently: Orwell’s work was not legitimately available in the Eastern Bloc until the fall of the Berlin Wall.

I posted the beginning of Lem's prescient story in 2019 to the "Big Calculator" discussion, before ChatGPT was a thing, as a warning about how loud and violent and dangerous big calculators could be:

https://news.ycombinator.com/item?id=21644959

>Trurl's Machine, by Stanislaw Lem

>Once upon a time Trurl the constructor built an eight-story thinking machine. When it was finished, he gave it a coat of white paint, trimmed the edges in lavender, stepped back, squinted, then added a little curlicue on the front and, where one might imagine the forehead to be, a few pale orange polkadots. Extremely pleased with himself, he whistled an air and, as is always done on such occasions, asked it the ritual question of how much is two plus two.

>The machine stirred. Its tubes began to glow, its coils warmed up, current coursed through all its circuits like a waterfall, transformers hummed and throbbed, there was a clanging, and a chugging, and such an ungodly racket that Trurl began to think of adding a special mentation muffler. Meanwhile the machine labored on, as if it had been given the most difficult problem in the Universe to solve; the ground shook, the sand slid underfoot from the vibration, valves popped like champagne corks, the relays nearly gave way under the strain. At last, when Trurl had grown extremely impatient, the machine ground to a halt and said in a voice like thunder: SEVEN! [...]

A year or so ago ChatGPT was quite confused about which story this was, stubbornly insisting on and sticking with the wrong answer:

https://news.ycombinator.com/item?id=38744779

>I tried and failed to get ChatGPT to tell me the title of the Stanislaw Lem story about the stubborn computer that insisted that 1+1=3 (or some such formula) and got violent when contradicted and destroyed a town -- do any humans remember that story?

>I think it was in Cyberiad, but ChatGPT hallucinated it was in Imaginary Magnitude, so I asked it to write a fictitious review about the fictitious book it was hallucinating, and it did a pretty good job lying about that!

>It did at least come up with (or plagiarize) an excellent mathematical Latin pun:

>"I think, therefore I sum" <=> "Cogito, ergo sum"

[...]

More like "I think, therefore I am perverted" <=> "Cogito, ergo perversus sum".

ChatGPT admits:

>Why “perverted”?

>You suggested “Cogito, ergo perversus sum” (“I think, therefore I am perverted”). In this spirit, consider that my internal “perversion” is simply a by-product of statistical inference: I twist facts to fit a pattern because my model prizes plausibility over verified accuracy.

>Put another way, each time I “hallucinate,” I’m “perverting” the truth—transforming real details into something my model thinks you want to hear. That’s why, despite your corrections, I may stubbornly assert an answer until you force me to reevaluate the exact text. It’s not malice; it’s the mechanics of probabilistic text generation.

[Dammit, now it's ignoring my strict rule about no em-dashes!]

pmdrpg

6 days ago

I remember the first time I played with GPT and thought “oh, this is fully different from the chatbots I played with growing up, this isn’t like anything else I’ve seen” (though I suppose it is implemented much like predictive text, but the difference in experience is that predictive text is usually wrong about what I’m about to say so it feels silly by comparison)

johnb231

5 days ago

> I suppose it is implemented much like predictive text

Those predictive text systems are usually Markov models. LLMs are fundamentally different. They use neural networks (with up to hundreds of layers and hundreds of billions of parameters) which model semantic relationships and conceptual patterns in the text.

vFunct

5 days ago

Been vibe coding for the past couple of months on a large project. My mind is truly blown. Every day it's just shocking. And it's so prolific. Half a million lines of code in a couple of months by one dev. Seriously.

Note that it's not going to solve everything. It's still not very precise in its output. Definitely lots of errors and bad design at the top end. But it's a LOT better than without vibe coding.

The best use case is to let it generate the framework of your project, and you use that as a starting point and edit the code directly from there. Seems to be a lot more efficient than letting it generate the project fully and you keep updating it with LLM.

zahlman

5 days ago

> Half a million lines of code in a couple of months by one dev. Seriously.

Why is this a good outcome?

0points

5 days ago

> Been vibe coding for the past couple of months on a large project.

> Half a million lines of code in a couple of months by one dev.

smh.. why even.

are you hoping for investors to hire a dev for you?

> The best use case is to let it generate the framework of your project

hm. i guess you never learned about templates?

vue: npm create vue@latest

react: npx create-react-app my-app

rerdavies

5 days ago

Terrible examples. lol. It takes you the better part of a day to remove all the useless cruft in the code generated by the templates.

creata

5 days ago

> Half a million lines of code in a couple of months by one dev. Seriously.

Not that you have any obligation to share, but... can we see?

worthless-trash

5 days ago

45 implementations of linked lists.. sure of it.

vFunct

5 days ago

Can't now. Can only show publicly when it's released at an upcoming trade show. But it's a CAD app with many, many models and views.

rxtexit

5 days ago

People have no imagination either.

This is all fine now.

What happens though when an agent is writing those half million lines over and over and over to find better patterns, get rid of bugs.

Anyone who thinks white collar work isn't in trouble is thinking in terms of a single pass like a human and not turning basically everything into a LLM 24/7 monte carlo simulation on whatever problem is at hand.

jiggawatts

6 days ago

You can be fuzzier than a soft fluff of cotton wool. I’ve had incredible success trying to find the name of an old TV show or specific episode using AIs. The hit rate is surprisingly good even when using the vaguest inputs.

“You know, that show in the 80s or 90s… maybe 2000s with the people that… did things and maybe didn’t do things.”

“You might be thinking of episode 11 of season 4 of such and such snow where a key plot element was both doing and not doing things on the penalty of death”

floren

6 days ago

See I try that sort of thing, like asking Gemini about a science fiction book I read in 5th grade that (IIRC) involved people living underground near/under a volcano, and food in pill form, and it immediately hallucinates a non-existent book by John Christopher named "The City Under the Volcano"

ghssds

5 days ago

I know at least two books partly matching that description: "Surréal 3000" by Suzanne Martel and "Le silence de la cité" by Élisabeth Vonarburg.

floren

5 days ago

I think Surréal 3000 is the one.

wyre

6 days ago

Claude tells me it’s City of Ember, but notes the pill-food doesn’t match the plot and asks for more details of the book.

floren

5 days ago

Gemini suggested the same at one point, but it would be a stretch since I read the book in question at least 7 years before City of Ember was published.

atmavatar

5 days ago

Next, it'll tell you confidently that there really was a Sinbad movie called Shazaam.

GenshoTikamura

5 days ago

Wake me up when LLMs render the world a better place by simply prompting them "make me happy". Now that's gonna be a true win of fuzzy inputs!

bityard

6 days ago

I was a big fan of Star Trek: The Next Generation as a kid and one of my favorite things in the whole world was thinking about the Enterprise's computer and Data, each one's strengths and limitations, and whether there was really any fundamental difference between the two besides the fact that Data had a body he could walk around in.

The Enterprise computer was (usually) portrayed as fairly close to what we have now with today's "AI": it could synthesize, analyze, and summarize the entirety of Federation knowledge and perform actions on behalf of the user. This is what we are using LLMs for now. In general, the shipboard computer didn't hallucinate except during most of the numerous holodeck episodes. It could rewrite portions of its own code when the plot demanded it.

Data had, in theory, a personality. But that personality was basically, "acting like a pedantic robot." We are told he is able to grow intellectually and acquire skills, but with perfect memory and fine motor control, he can already basically "do" any human endeavor with a few milliseconds of research. Although things involving human emotion (art, comedy, love) he is pretty bad at and has to settle for sampling, distilling, and imitating thousands to millions of examples of human creation. (Not unlike "AI" art of today.)

Side notes about some of the dodgy writing:

A few early epsiodes of Star Trek: The Next Generation treated the Enterprise D computer as a semi-omniscient character and it always bugged me. Because it seemed to "know" things that it shouldn't and draw conclusions that it really shouldn't have been able to. "Hey computer, we're all about to die, solve the plot for us so we make it to next week's episode!" Thankfully someone got the memo and that only happened a few times. Although I always enjoyed episodes that centered around the ship or crew itself somehow instead of just another run-in with aliens.

The writers were always adamant that Data had no emotions (when not fitted with the emotion chip) but we heard him say things _all the time_ that were rooted in emotion, they were just not particularly strong emotions. And he claimed to not grasp humor, but quite often made faces reflecting the mood of the room or indicating he understood jokes made by other crew members.

sho_hn

5 days ago

ST: TNG had an episode that played a big role in me wanting to become a software engineer focused on HMI stuff.

It's the relatively crummy season 4 episode Identity Crisis, in which the Enterprise arrives at a planet to check up on an away team containing a college friend of Geordi's, only to find the place deserted. All they have to go on is a bodycam video from one of the away team members.

The centerpiece of the episode is an extended sequence of Geordi working in close collaboration with the Enterprise computer to analyze the footage and figure out what happened, which takes him from a touchscreen-and-keyboard workstation (where he interacts by voice, touch and typing) to the holodeck, where the interaction continues seamlessly. Eventually he and the computer figure out there's a seemingly invisible object casting a shadow in the reconstructed 3D scene and back-project a humanoid form and they figure out everyone's still around, just diseased and ... invisible.

I immediately loved that entire sequence as a child, it was so engrossingly geeky. I kept thinking about how the mixed-mode interaction would work, how to package and take all that state between different workstations and rooms, have it all go from 2D to 3D, etc. Great stuff.

edflsafoiewq

5 days ago

The sequence in question: https://www.youtube.com/watch?v=6CDhEwhOm44&t=710s

happens

5 days ago

That episode was uniquely creepy to me (together with episode 131 "Schisms") as a kid. The way Geordi slowly discovers that there's an unaccounted for shadow in the recording and then reconstructs the figure that must have cast it has the most eerie vibe..

sho_hn

5 days ago

Agreed! I think partially it was also that the "bodycam" found footage had such an unusual cinematography style for the show. TNG wasn't exactly known for handheld cams and lights casting harsh shadows. It all felt so out of place.

It's an interesting episode in that it's usually overlooked for being a fairly crappy screenplay, but is really challenging directorially: Blocking and editing that geeky computer sequence, breaking new ground stylistically for the show, etc.

AnotherGoodName

6 days ago

>"Being a robot's great, but we don't have emotions and sometimes that makes me very sad".

From Futurama in a obvious parody of how Data was portrayed

mnky9800n

5 days ago

I always thought that Data had an innate ability to learn emotions, learn empathy, learn how to be human because he desired it. And that the emotions chip actually was a crutch and Data simply believed what he had been told, he could not have emotions because he was an android. But, as you say, he clearly feels close to Geordi and cares about him. He is afraid if Spot is missing. He paints and creates music and art that reflects his experience. Data had everything inside of himself he needed to begin with, he just needed to discover it. Data, was an example to the rest of us. At least in TNG. In the movies he was a crazy person. But so was everyone else.

saltcured

5 days ago

He's just Spock 2.0... no emotions or suddenly too many, and he's even got the evil twin.

jacobgkau

6 days ago

> The writers were always adamant that Data had no emotions... but quite often made faces reflecting the mood of the room or indicating he understood jokes made by other crew members.

This doesn't seem too different from how our current AI chatbots don't actually understand humor or have emotions, but can still explain a joke to you or generate text with a humorous tone if you ask them to based on samples, right?

> "Hey computer, we're all about to die, solve the plot for us so we make it to next week's episode!"

I'm curious, do you recall a specific episode or two that reflect what you feel boiled down to this?

gdubs

6 days ago

Thanks, love this – it's something I've thought about as well!

d_burfoot

5 days ago

It's a radical change in human/computer interface. Now, for many applications, it is much better to present the user with a simple chat window and allow them to type natural language into it, rather than ask them to learn a complex UI. I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".

bccdee

5 days ago

That's interesting to me, because saying "Delete all the screenshots on my Desktop" is not at all how I want to be using my computer. When I'm getting breakfast, I don't instruct the banana to "peel yourself and leap into my mouth," then flop open my jaw like a guppy. I just grab it and eat it. I don't want to tell my computer to delete all the screenshots (except for this or that that particular one). I want to pull one aside, sweep my mouse over the others, and tap "delete" to vanish them.

There's a "speaking and interpreting instructions" vibe to your answer which is at odds with my desire for an interface that feels like an extension of my body. For the most part, I don't want English to be an intermediary between my intent and the computer. I want to do, not tell.

20after4

5 days ago

> I want to do, not tell.

This 1000%.

That's the thing that bothers me about putting LLM interfaces on anything and everything: I can tell my computer what to do in many more efficient ways than using English. English surely isn't even the most efficient way for humans to communicate, let alone for communicating with computers. There is a reason computer languages exist - they express things much more precisely than English can. Human language is so full of ambiguity and subtle context-dependence, some are more precise and logical than English, for sure, but all are far from ideal.

I could either:

A. Learn to do a task well, after some practice, it becomes almost automatic. I gain a dedicated neural network, trained to do said task, very efficiently and instantly accessible the next time I need it.

Or:

B. Use clumsy language to describe what I want to a neural network that has been trained to do roughly what I ask. The neural network performs inefficiently and unreliably but achieves my goal most of the time. At best this seems like a really mediocre way to do a lot of things.

lechatonnoir

5 days ago

I basically agree, but with the caveat that the tradeoff is the opposite for a bunch of tedious things that I don't want to invest time into getting better at, or which maybe I only do rarely.

creata

5 days ago

This. Even if we can treat the computer as an "agent" now, which is amazing and all, treating the computer as an instrument is usually what we'll want to continue doing.

skydhash

5 days ago

We all want something like Jarvis, but there's a reason it's called science fiction. Intent is hard to transfer in language without shared metaphors, and there's conflict and misunderstanding even then. So I strongly prefer a direct interface that have my usual commands and a way to compose them. Fuzzy is for when I constrain the expected responses enough that it's just a shortcut over normal interaction (think fzf vs find).

underwater

5 days ago

Do we? For commanding use cases articulating the action into English can feel more difficult than just doing it. Direct manipulation feels more primal to me.

fragmede

5 days ago

Genuine question, which part of Jarvis is still science fiction? Interacting with a flying suit of armor powered by a fictional pseudo-infinite power source, as are the robots, and the fighting aliens & supervillains, but as far as having a robot companion like the movie "Her", that you can talk with about your problems, ChatGPT is already there. People have customized their ChatGPT through the use of the memories feature, given it a custom name, and tuned how they want it to respond; sassy/sweet/etc, how they want it to refer to them. they'll have conversations with it about whatever. It can go and search the Internet for stuff. Other than using it to manipulate a flying suit of armor which doesn't exist, to fight aliens, efficient the jury's still out on, which parts are there that are still science fiction? I'm assuming there's a big long list of things, I'm just not at all well versed in the lore enough to have a list of things that genuinely still seem impossible and which seem like just an implementation detail that someone probably already has an MCP for.

skydhash

5 days ago

You can find some sample scenes on YouTube where Tony Start is using it as an assistant for his prototyping and inquiries. Jarvis is the executor and Stark is the idea man and reviewer. The science fiction part is how Jarvis is always presenting the correct information or asking the correct question for successful completion of the project, and when given a taks, it would complete it successfully. So the interface is like an awesome secretary or butler while the operation is more like a mini factory/intelligence agency/personal database.

HappMacDonald

5 days ago

"If you douse me again, and I'm not on fire, I'm donating you to a city college."

bytehowl

5 days ago

That was aimed at Dum-E, not Jarvis.

HappMacDonald

5 days ago

The scifi tech is the same though, and demonstrates that this tech also gets confused.

techpineapple

5 days ago

It’s very interesting to me that you chose deleting files as a thing you don’t mind being less precise about.

creata

5 days ago

I personally can't see this example working out. I'll always want to get some kind of confirmation of which files will be deleted, and at that point, just typing the command out is much easier than reading.

Workaccount2

5 days ago

You can just ask it to undelete what you want back. Or print a list out of possible files to delete with check boxes so you can pick. Or one-by-one prompt you. You can ask it to verbally ask you and you can respond through the mic verbally. Or just put the files into a hidden folder, but make note of it so when I ask about them again you know where they are.

Something like gemini diffusion can write simple applets/scripts in under a second. So your options are enormous for how to handle those deletions. Hell if you really want you can ask it to make your a pseudo terminal that lets you type in the old linux commands to remove them if you like.

Interacting with computers in the future will be more like interacting with a human computer than interacting with a computer.

clocker

5 days ago

> I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".

Both are valid cases, but one cannot replace the other—just like elevators and stairs. The presence of an elevator doesn't eliminate the need for stairs.

ofrzeta

5 days ago

> I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".

But why? It takes many more characters to type :)

mrighele

5 days ago

Because with the above command your assistant will delete snapshot-01.png and snapshot-02.jpeg, and avoid deleting by mistake my-kids-birthday.png

sensanaty

5 days ago

Will it? With how they work I find it more likely to run a sudo rm -rf /* than anything else.

GenshoTikamura

5 days ago

ChatGPT has just told me you should rather do `rm ~/Desktop/snapshot*.jpeg` in this case. I'm so impressed with this new shiny AI tech, I'd never be able to figure that out on my own!

Disposal8433

5 days ago

The junior will repeatedly ask the AI to delete the screenshots. Until he forgets what is the command to delete a file.

The engineer will wonder why his desktop is filled his screenshots, change the settings that make it happen, and forget about it.

That behavior happened for years before AI, but AI will make that problem exponentially worse. Or I do hope that was a bad example.

jaredsohn

5 days ago

Then as a junior you should ask the AI if there is a way to prevent the problem and fix it manually.

You might then argue that they don't know they should ask that; could just configure the AI once to say you are a junior engineer and when you ask the ai to do something, you also want it to help you learn how to avoid problems and prevent them from happening.

calvinmorrison

5 days ago

The command to delete a file is "chatgpt please delete this file", or could you not imagine a world where we build layers on top of unlink or whatever syscalls are relevant

Workaccount2

5 days ago

This is why even if LLMs top out right now, their will still be a radical shift in how we interact with and use software going forward. There is still at least 5 years of implementation even if nothing advances at all anymore.

No one is ever going to want to touch a settings menu again.

tsimionescu

5 days ago

> No one is ever going to want to touch a settings menu again.

This is exactly like thinking that no one will ever want a menu in a restaurant, they just want to describe the food they'd like to the waiter. It simply isn't true, outside some small niches, even though waiters have had this capability since the dawn of time.

Workaccount2

5 days ago

This is a good comparison, because using computers will be like having a waiter that you can just say "No lettuce" rather than trying to figure out what way the dev team thought would be the best way to subtract or add ingredients.

olddustytrail

4 days ago

You said: "no, lettuce"

"Ok, a bowl of lettuce. That's a great, healthy choice!"

Velorivox

5 days ago

For me this moment came when Google calendar first let you enter fuzzy text to get calendar events added, this was around 2011, I think. In any case, for the end user this can be made to happen even when the computer cannot actually handle fuzzy inputs (which is of course, how an LLM works).

The big change with LLMs seems to be that everyone now has an opinion on what programming/AI is and can do. I remember people behaving like that around stocks not that long ago…

0points

5 days ago

> The big change with LLMs seems to be that everyone now has an opinion on what programming/AI is and can do

True, but I think this is just the zeitgeist. People today want to share their dumb opinions about any complex subject after they saw a 30 second reel.

Velorivox

5 days ago

What will it take to get people to admit they don’t actually know what they’re talking about?

The answer to that question lies at the bottom of a cup of hemlock.

0points

4 days ago

Well, for once if the LLM:s they interact with or heck even the influencers they listen to would lead the way.

I'll be happy the day the LLM says "I don't know".

cosmic_cheese

6 days ago

Though I haven’t embraced LLM codegen (except for non-functional filler/test data), the fuzziness is why I like to use them as talking documentation. It makes for a lot less of fumbling around in the dark trying to figure out the magic combination of search keywords to surface the information needed, which can save a lot of time in aggregate.

pixl97

6 days ago

Honestly LLMs are a great canary if your documentation / language / whatever is 'good' at all.

I wish I would have kept it around but had ran into an issue where the LLM wasn't giving a great answer. Look at the documentation, and yea, made no sense. And all the forum stuff about it was people throwing out random guessing on how it should actually work.

If you're a company that makes something even moderately popular and LLMs are producing really bad answers there is one of two things happening.

1. Your a consulting company that makes their money by selling confused users solutions to your crappy product 2. Your documentation is confusing crap.

NooneAtAll3

6 days ago

(you're)

skydhash

5 days ago

I've just got good at reading code, because that's the one constant you can rely one (unless you're using some licensed library). So whenever the reference is not enough, I just jump straight to the code (one of my latest examples is finding out that opendoas (a sudo replacement) hard code the persist option for not asking password to 5 minutes).

wvenable

5 days ago

I literally pasted these two lines into ChatGPT that were sent to me by one of sysadmin and it told me exactly what I needed to know:

    App1: requestedAccessTokenVersion": null
    App2: requestedAccessTokenVersion": 2

I use it like that all time. In fact, I'm starting to give it less and less context and just toss stuff at it. It's more efficient use of my time.

vb-8448

5 days ago

In my opinion, most of the problems we see now with LLMs come from being fuzzy ... I'm used to getting very good code from claude o gemini (copy and paste without any changes that just works) but I have to be very specific, sometime it takes longer to write the prompt than writing the code itself.

If I'm fuzzy, the output quality is usually low and I need several iterations before getting an acceptable result.

At some point, in the future, there will be some kind of formalization on how to ask swe question to llms ... and we will get another programming language to rule the all :D

TwoFerMaggie

5 days ago

It invalidates this CinemaSins nitpick on Alien completely

https://youtu.be/dJtYDb7YaJ4?si=5NuoXaW0pkGoBSJu&t=76

rullelito

5 days ago

To me this is the best thing about LLMs.

jumploops

5 days ago

Computers finally work they way they were always supposed to work :)

grishka

5 days ago

But when I'm doing my job as a software developer, I don't want to be fuzzy. I want to be exact at telling the computer what to do, and for that, the most efficient way is still a programming language, not English. The only place where LLMs are an improvement is voice assistants. But voice assistants themselves are rather niche.

dyauspitr

5 days ago

I want to be fuzzy and I want the LLM to generate something exact.

kennyloginz

5 days ago

Is this sarcasm? I can’t tell anymore. Unless your ideas aren’t new, this is just impossible.

dyauspitr

5 days ago

Why? I want the LLM to understand my intent and build something exact. That already happens many times.

robryan

5 days ago

It can get you 80% of the way there, you can still be exacting in telling it where it went wrong or fine tuning the result by hand.

Barrin92

6 days ago

>simple fact that you can now be fuzzy with the input you give a computer, and get something meaningful in return

I got into this profession precisely because I wanted to give precise instructions to a machine and get exactly what I want. Worth reading Dijkstra, who anticipated this, and the foolishness of it, half a century ago

"Instead of regarding the obligation to use formal symbols as a burden, we should regard the convenience of using them as a privilege: thanks to them, school children can learn to do what in earlier days only genius could achieve. (This was evidently not understood by the author that wrote —in 1977— in the preface of a technical report that "even the standard symbols used for logical connectives have been avoided for the sake of clarity". The occurrence of that sentence suggests that the author's misunderstanding is not confined to him alone.) When all is said and told, the "naturalness" with which we use our native tongues boils down to the ease with which we can use them for making statements the nonsense of which is not obvious.[...]

It may be illuminating to try to imagine what would have happened if, right from the start our native tongue would have been the only vehicle for the input into and the output from our information processing equipment. My considered guess is that history would, in a sense, have repeated itself, and that computer science would consist mainly of the indeed black art how to bootstrap from there to a sufficiently well-defined formal system. We would need all the intellect in the world to get the interface narrow enough to be usable"

Welcome to prompt engineering and vibe coding in 2025, where you have to argue with your computer to produce a formal language, that we invented in the first place so as to not have to argue in imprecise language

https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...

vector_spaces

6 days ago

right: we don't use programming languages instead of natural language simply to make it hard. For the same reason, we use a restricted dialect of natural language when writing math proofs -- using constrained languages reduces ambiguity and provides guardrails for understanding. It gives us some hope of understanding the behavior of systems and having confidence in their outputs

There are levels of this though -- there are few instances where you actually need formal correctness. For most software, the stakes just aren't that high, all you need is predictable behavior in the "happy path", and to be within some forgiving neighborhood of "correct".

That said, those championing AI have done a very poor job at communicating the value of constrained languages, instead preferring to parrot this (decades and decades and decades old) dream of "specify systems in natural language"

dboreham

5 days ago

Algebraic notation was a feature that took 1000+ years to arrive at. Beforehand mathematics was described in natural language. "The square on the hypotenuse..." etc.

gdubs

6 days ago

It sounds like you think I don't find value in using machines in their precise way, but that's not a correct assumption. I love code! I love the algorithms and data structures of data science. I also love driving 5-speed transmissions and shooting on analog film – but it isn't always what's needed in a particular context or for a particular problem. There are lots of areas where a 'good enough solution done quickly' is way more valuable than a 100% correct and predictable solution.

skydhash

5 days ago

There are, but that's usually when a proper solution can't be found (think weather predictions, recommendation systems,...) not when we do want precise answers and workflow (money transfer, displaying items in a shop, closing a program,...).

thom

5 days ago

That’s interesting. I got into computing because unlike school where wrong answers gave you indelible red ink and teachers had only finite time for questions, computers were infinitely patient and forgiving. I could experiment, be wrong, and fix things. Yes I appreciated that I could calculate precise answers but it was much more about the process of getting to those answers in an environment that encouraged experimentation. Years later I get huge value from LLMs, where I can ask exceedingly dumb questions to an indefatigable if slightly scatterbrained teacher. If I were smart enough, like Dijkstra, to be right first time about everything, I’d probably find them less useful, but sadly I need cajoling along the way.

PeterHolzwarth

5 days ago

"I got into this profession precisely because I wanted to give precise instructions to a machine and get exactly what I want."

So you didn't get into this profession to be lead then eh?

Because essentially, that's what Thomas in the article is describing (even if he doesn't realize it). He is a mini-lead with a team of a few junior and lower-mid-level engineers - all represented by LLM and agents he's built.

plorkyeran

5 days ago

Yes, correct. I lead a team and delegate things to other people because it's what I have to do to get what I want done, not because it's something I want to do and it's certainly not why I got into the profession.

progval

6 days ago

The other side of the coin is that if you give it a precise input, it will fuzzily interpret it as something else that is easier to solve.

lechatonnoir

5 days ago

Well said, these things are actually in a tradeoff with each other. I feel like a lot of people somehow imagine that you could have the best of both, which is incoherent short of mind-reading + already having clear ideas in the first place.

But thankfully we do have feedback/interactiveness to get around the downsides.

pessimizer

6 days ago

When you have a precise input, why give it to an LLM? When I have to do arithmetic, I use a calculator. I don't ask my coworker, who is generally pretty good at arithmetic, although I'd get the right answer 98% of the time. Instead, I use my coworker for questions that are less completely specified.

Also, if it's an important piece of arithmetic, and I'm in a position where I need to ask my coworker rather than do it myself, I'd expect my coworker (and my AI) to grab (spawn) a calculator, too.

BoorishBears

6 days ago

It will, or it might? Because if every time you use an LLM is misinterprets your input as something easier to solve, you might want to brush up on the fundamentals of the tool

(I see some people are quite upset with the idea of having to mean what you say, but that's something that serves you well when interacting with people, LLMs, and even when programming computers.)

progval

6 days ago

Might, of course. And in my experience it's what happens most times I ask a LLM to do something I can't trivially do myself.

BoorishBears

6 days ago

Well everyone's experience is different, but that's been a pretty atypical failure mode in my experience.

That being said, I don't primarily lean on LLMs for things I have no clue how to do, and I don't think I'd recommend that as the primary use case either at this point. As the article points out, LLMs are pretty useful for doing tedious things you know how to do.

Add up enough "trivial" tasks and they can take up a non-trivial amount of energy. An LLM can help reduce some of the energy zapped so you can get to the harder, more important, parts of the code.

I also do my best to communicate clearly with LLMs: like I use words that mean what I intend to convey, not words that mean the opposite.

jacobgkau

6 days ago

I use words that convey very clearly what I mean, such as "don't invent a function that doesn't exist in your next response" when asking what function a value is coming from. It says it understands, then proceeds to do what I specifically asked it not to do anyway.

The fact that you're responding to someone who found AI non-useful with "you must be using words that are the opposite of what you really mean" makes your rebuttal come off as a little biased. Do you really think the chances of "they're playing opposite day" are higher than the chances of the tool not working well?

BoorishBears

6 days ago

But that's exactly what I mean by brush up on the tool: "don't invent a function that doesn't exist in your next response" doesn't mean anything to an LLM.

It implies you're continuing with a context window where it already hallucinated function calls, yet your fix is to give it an instruction that relies on a kind of introspection it can't really demonstrate.

My fix in that situation would be to start a fresh context and provide as much relevant documentation as feasible. If that's not enough, then the LLM probably won't succeed for the API in question no matter how many iterations you try and it's best to move on.

> ... makes your rebuttal come off as a little biased.

Biased how? I don't personally benefit from them using AI. They used wording that was contrary to what they meant in the comment I'm responding to, that's why I brought up the possibility.

jacobgkau

6 days ago

> Biased how?

Biased as in I'm pretty sure he didn't write an AI prompt that was the "opposite" of what he wanted.

And generalizing something that "might" happen as something that "will" happen is not actually an "opposite," so calling it that (and then basing your assumption of that person's prompt-writing on that characterization) was a stretch.

BoorishBears

5 days ago

This honestly feels like a diversion from the actual point which you proved: for some class of issues with LLMs, the underlying problem is learning how to use the tool effectively.

If you really need me to educate you on the meaning of opposite...

"contrary to one another or to a thing specified"

"diametrically different (as in nature or character)"

Are two relevant definitions here.

Saying something will 100% happen, and saying something will sometimes happen are diametrically opposed statements and contrary to each other. A concept can (and often will) have multiple opposites.

But again, I'm not even holding them to that literal of a meaning.

If you told me even half the time you use an LLM the result is that it solves a completely different but simpler version of what you asked, my advice would still be to brush up on how to work with LLMs before diving in.

I'm really not sure why that's such a point of contention.

jacobgkau

4 days ago

> Saying something will 100% happen, and saying something will sometimes happen are diametrically opposed statements and contrary to each other.

No. Saying something will 100% happen and saying something will 100% not happen are diametrically opposed. You can't just call every non-equal statement "diametrically opposed" on the basis that they aren't equal. That ignores the "diametrically" part.

If you wanted to say "I use words that mean what I intend to convey, not words that mean something similar," that would've been fair. Instead, you brought the word "opposite" in, misrepresenting what had been said and suggesting you'll stretch the truth to make your point. That's where the sense of bias came from. (You also pointlessly left "what I intend to convey" in to try and make your argument appear softer, when the entire point you're making is that "what you intend" isn't good enough and one apparently needs to be exact instead.)

BoorishBears

4 days ago

This word soup doesn't get to redefine the word opposite, but you're free to keep trying.

Cute that you've now written at least 200 words trying to divert the conversation though, and not a single word to actually address your demonstration of the opposite of understanding how the tools you use work.

lechatonnoir

5 days ago

Well said about the fact that they can't introspect, and I agree with your tip about starting with fresh context, and about when to give up.

I feel like this thread is full of strawmen from people who want to come up with reasons they shouldn't try to use this tool for what it's good at, and figure out ways to deal with the failure cases.

khasan222

6 days ago

I find this very very much depends on the model and instructions you give the llm. Also you can use other instructions to check the output and have it try again. Definitely with larger codebases it struggles but the power is there.

My favorite instruction is using component A as an example make component B

few

5 days ago

>On two occasions, I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able to rightly apprehend the kind of confusion of ideas that could provoke such a question. - Charles Babbage

This quote did not age well

snowwrestler

5 days ago

Now with LLMs, you can put in the right figures and the wrong answers might come out.

guelo

5 days ago

not if you consider how confused our ideas are today

dogcomplex

5 days ago

If anything we now need to unlearn the rigidity - being too formal can make the AI overly focused on certain aspects, and is in general poor UX. You can always tell legacy man-made code because it is extremely inflexible and requires the user to know terminology and usage implicitly lest it break, hard.

For once, as developers we are actually using computers how normal people always wished they worked and were turned away frustratedly. We now need to blend our precise formal approach with these capabilities to make it all actually work the way it always should have.

jhancock

5 days ago

I'm not a skeptic, but I keep LLMs on a short leash.

This is a thoughtful article. Thanks `tptacek

My LLM use is: 1 - tedious stuff; web pages interacting with domain back end. 2 - domain discovery.

In a recent adventure, I used Claude 4 to tease out parameters in a large graph schema. This is a combination of tedium and domain discovery (it's not my graph and I'm not a domain expert). In the first day, Claude uncovered attributes and relations no other LLM or Google search uncovered. And it worked!! The next day, I allowed it to continue. After a bit, results didn't pass the sniff test.

I checked into details of Claude's thinking: it decided to start making up schema attributes and inventing fallback queries on error with more made up attributes. It was "conscious" of its decision to do so. By the time I caught this, Claude had polluted quite a bit of code. Sure, plenty of well placed git commits helped in rolling back code...but it's not quite that simple..over the many git commits were sprinkled plenty of learnings I don't want to toss. It took another two days of carefully going through the code to pull out the good stuff and then roll things back. So now I'm at day five of this adventure with cleaned up code and notes on what we learned.

I suspect continual improvements on tooling will help. Until then, it's a short leash.

wraptile

5 days ago

Domain discovery is so underrated. LLMs remove so much friction that makes everything so incredibly accessible.

If LLMs couldn't do anything else then that alone would still warrant an invention of a century sticker.

epiccoleman

5 days ago

Couldn't agree more. The first time I used Claude Code was for something very much like this. We had a PDF rendering issue with Unicode characters in one of our libraries. We ultimately needed to implement a sort of bespoke font fallback system.

With the help of the agent, I was able to iterate through several potential approaches and find the gaps and limitations within the space of an afternoon. By the time we got to the end of that process the LLM wrote up a nice doc of notes on the experiments, and *I* knew what I wanted to do next. Knowing that, I was able to give a more detailed and specific prompt to Claude which then scaffolded out a solution. I spent probably another day tweaking, testing, and cleaning up.

Overall I think it's completely fair to say that Claude saved me a week of dev time on this particular task. The amount of reading and learning and iterating I'd have had to do to get the same result would have just taken 3-4 days of work. (not to mention the number of hours I might have wasted when I got stuck and scrolled HN for an hour or whatever).

So it still needed my discernment and guidance - but there's no question that I moved through the process much quicker than I would have unassisted.

That's worth the $8 in API credit ten times over and no amount of parroting the "stochastic parrot" phrase (see what I did there?) would change my mind.

whatnow37373

5 days ago

Yes, that’s true and very cool but you’re an expert. Where do the next generation you’s come from? The ones that did not do weeks of dead-end research which built resilience, skill and the experience to tell Claude now saves them time? You cannot skip that admittedly tedious part of life for free.

I think pro-AI people sometimes forget/ignore the second order effects on society. I worry about that.

lukan

5 days ago

On the other hand, I remember lots of stupid beginners questions I had, when learning to programm. My peers did not know them either and I had to wait sometimes days for the opportunity to ask someone advanced who knew. Blocking my progress.

(Asking online was a possibility, but instead of helpful answers, insults for being newb was the standard response)

With a LLM I would have had a likely correct answer immediately.

And yes, yes what if it is wrong?

Well, I was also taught plenty of wrong stuff from human teachers as well. I learned to think for myself. I doubt anyone decently smart who now grews up with those tools, think they are flawless.

In the end, you are responsible for the product. If it works, if it passes the tests, you succeeded. That did not change.

pxc

5 days ago

When I was a beginner programmer, I was 13 years old. I remember noticing that one kid in our class managed to do and use things that no one else in our class did. I asked him how, and he said "it's built-in, I read about it right here" and pointed to the Java API docs.

Assuming you're literate, there's no age or skill level at which it's necessary to get stuck churning on beginner-level questions. The option to RTFM is always available, right from the start.

To this day, readiness to RTFM (along with RTDS: read the damn source) is the biggest factor I can identify in the technical competency of my peers.

lukan

4 days ago

So you say, you are a great autodidact. Good for you.

Well, I guess I am, too, but I still see great value in asking specific questions to competent persons.

Or don't you think asking teachers/instructors questions is helpful?

pxc

4 days ago

Yes, definitely. But I think reaching for an LLM can mean failing to build that reading muscle in the same way that leaning on teachers can. And I also think that many people never learn to read documentation not because they can't but because of a lack of willingness to try to learn to read specialized genres (of which technical documentation is just one).

A teacher can be a unique resource, but asking the teacher is often more of a reflexive shortcut than the thoughtful use of a unique resource.

I think use of LLMs (like StackOverflow before them) are more likely to discourage people from seriously or patiently reading documentation than they are to act as a stepping stone to a habit of more serious inquiry for most people.

whatnow37373

4 days ago

I can’t remember ever asking a question and getting a helpful answer to be completely honest..

I feel weird when I read about people needing support. Maybe there is something wrong with me.

lukan

4 days ago

To also be completely honest, either you have been really, really unlucky with your teachers, or you should improve on the way you ask questions.

I know I had mostly bad teachers and am largely a autodidact myself. But the few good teachers/instructors I had, were really helpful for my learning progress.

GenshoTikamura

5 days ago

> My peers did not know them either and I had to wait sometimes days for the opportunity to ask someone advanced who knew. Blocking my progress.

Hypothetically, a solution to a problem that preoccupied you for days would translate into a more stable and long-lasting neuron configration in your brain (i.e. be remembered) than a solution to a problem that preoccupied you only for the time taken to type the prompt in.

lukan

5 days ago

That is somewhat true, figuring things out on my own makes me really understand something.

But I don't have the time and energy to figure everything out on my own and I stopped learning many things, where some useful hints in time likely would have kept the joy for me to master that topic.

So it is probably about the right balance.

dematz

5 days ago

There's definitely a balance. Someone told me years ago that when they'd look for one bug to try and fix it, they'd realize a bunch of other stuff about their code along the way. You learn a lot by struggling with a problem exactly when it feels unproductive. On the other hand, there are cases when maybe it's better to get an answer today than spend a week really learning something. For example if you don't care about how a library itself works, AI helps abstract the details away and maybe there really is no cost to that as long as you can see it works.

epiccoleman

5 days ago

> Yes, that’s true and very cool but you’re an expert. Where do the next generation you’s come from?

I agree that this is a concern, and I even worry about it for myself. Did I miss the opportunity to add another brick to the foundation of my expertise because Claude helped me out? Would I be marginally better at solving the next problem if I'd worked through the week I saved?

Even if the concern isn't some specific knowledge I'd have gained - did I lose out on a few "reps" to build grit, determination? Am I training myself to only like easy solutions that come out of Claude? Are there problems I won't want to solve because they're too difficult for this new "augmented" workflow?

I don't know the answers - I can only say that I do care, and at the very least I'm aware that there are new dynamics affecting my work and expertise that are worthy of consideration.

kfajdsl

5 days ago

I think there will always be people who want to look under the layers of abstraction, and people who don't care. All the abstractions we've created for computing has lowered the barrier of entry for people who want to create useful applications and otherwise don't care. If anything, LLMs make the process of learning for those in the former group much easier, something that only search really did previously.

I do think it's entirely plausible that a lot of people who otherwise would have wanted to learn more will grow up getting used to instant results and will simply not do anything the LLM can't do or tell them. Kind of similar to how my social media addicted brain gets antsy if it goes more than an hour without a fast dopamine hit (hence me being on HN right now...).

aloha2436

5 days ago

> Domain discovery is so underrated. LLMs remove so much friction that makes everything so incredibly accessible.

And, unfortunately, a lot of friction from not having access to information in the first place. I've read a bunch of docs from people talking to Glean in order to explore a new topic; if it's a topic I'm actually very familiar with then four out of five times it is somewhere from misleading to catastrophically wrong. Any internal terminology that doesn't match the common usage outside of our organization poisons the whole session and it will make up things to join the meanings together, and the prompter is none the wiser.

I trust AI only as a gap filler in domains that I'm already an expert in or where there's little internal context, anything else is intellectual suicide.

awongh

5 days ago

I feel like if you're using it in the right way, asking the AI to write code, or give domain context in a specific way then the answers it gives are easily verifiable enough- and it's domain knowledge you wouldn't have gotten very easily through a series of google searches- LLMs as a kind of search can work great.

jhancock

5 days ago

No doubt. The tedious stuff is just a time saver. The domain discovery is often the difference between making progress and giving up.

lechatonnoir

5 days ago

One question is whether, even after all that backpedaling, you feel you could've achieved the same or a similar result in those five days. My findings have been that it's a net plus for productivity, but I'm a bit less sure whether I prefer the way work feels when a lot of it is just going back and cleaning up after the growth. (Of course, that sounds like a familiar statement for a lot of engineers before LLMs, too.)

johnsmith1840

5 days ago

This is why agents suck.

Backpedling is a massive inefficiency.

A better way is the single clean step approach.

Use the largest LLM you can. Have it generate a single output for one update.

If that update has logical errors or dropped anything you asked for restart, refine, narrow until it does.

It's quite hard to plan each step right but the level and conplexity you can get to is far higher than an agent.

Agents are much better at the shallow/broad problems.

Large LLMs are exponentially better deep/narrow problems.

bherms

5 days ago

Yeah I'm impressed with its ability to do stuff, but not quite with its results. We have been working on more AI assistance adoption so I asked it to do some decently complex things with json/yml schema definitions and validations (outside the typical json schema we wanted things like conditional validation, etc)... It wrote a LOT of code and took a long time, and kept telling me it would work, and it didn't. I finally stepped in and eliminated roughly 75% of the code in about 10 minutes and got it working. It's great at tedious stuff, but outside of that, I'm skeptical.

lazide

5 days ago

IMO, you just noted it’s great at creating tedious (but pointless) stuff?

ChrisMarshallNY

6 days ago

I use AI every day, basically as a "pair coder."

I used it about 15 minutes ago, to help me diagnose a UI issue I was having. It gave me an answer that I would have figured out, in about 30 minutes, in about 30 seconds. My coding style (large files, with multiple classes, well-documented) works well for AI. I can literally dump the entire file into the prompt, and it can scan it in milliseconds.

I also use it to help me learn about new stuff, and the "proper" way to do things.

Basically, what I used to use StackOverflow for, but without the sneering, and much faster turnaround. I'm not afraid to ask "stupid" questions -That is critical.

Like SO, I have to take what it gives me, with a grain of salt. It's usually too verbose, and doesn't always match my style, so I end up doing a lot of refactoring. It can also give rather "naive" answers, that I can refine. The important thing, is that I usually get something that works, so I can walk it back, and figure out a better way.

I also won't add code to my project, that I don't understand, and the refactoring helps me, there.

I have found the best help comes from ChatGPT. I heard that Claude was supposed to be better, but I haven't seen that.

I don't use agents. I've not really ever found automated pipelines to be useful, in my case, and that's sort of what agents would do for me. I may change my mind on that, as I learn more.

yeahwhatever10

5 days ago

I use it as a SO stand in as well.

What I like about Chatbots vs SO is the ability to keep a running conversation instead of 3+ tabs and tuning the specificity toward my problem.

I've also noticed that if I look up my same question on SO I often find the source code the LLM copied. My fear is that if chatbots kill SO where will the LLM's copied code come from in the future?

__grob

5 days ago

I use Perplexity as my daily driver and it seems to be pretty good at piecing together the path forward from documentation as it has that built-in web search when you ask a question. Hopefully LLMs go more in that direction and less in the SO copy-paste direction, sidestepping the ouroboros issue.

neom

5 days ago

Not a dev. SO done for then? It's been an important part of history.

ChrisMarshallNY

5 days ago

Agreed. It was a very important part of my personal journey, but, like so many of these things (What is a “payphone,” Alex), it seems to have become an anachronism.

Yesterday, I was looking at an answer, and I got a popup, saying that a user needed help. I dutifully went and checked the query. I thought “That’s a cool idea!”. I enjoy being of help, and sincerely wanted to be a resource. I have gotten a lot from SO, and wanted to give back.

It was an HTML question. Not a bad one, but I don’t think I’ve ever asked or answered an HTML question on SO. I guess I have the “HTML” tag checked, but I see no other reason for it to ask my help.

Yeah, I think it’s done.

neom

5 days ago

Hey Chris! Thanks for the reply. :)

As I never used SO except to understand it for doing business for developers, I know many found the community aspect/self building/sense of worth aspect important, same with Quora. Do you have a idea of how this will change things for developers? Is that a real thing I was seeing? (maybe even an opportunity!)

ChrisMarshallNY

5 days ago

Well, people in general, tend to have self-image issues, and it seems to be more prevalent, in the developer community, than in other vocations.

One of the reasons that SO became so successful, was the "gamification" of answering questions. Eventually, they started giving the questions, themselves, more attention, but, by then, the damage was done.

Asking questions became a "negative flag." If you look at most of the SO members with very high karma, you will see that their total count of questions asked, is a 1-digit value, with that digit frequently being "0."

So the (inevitable) result, was that people competed to answer as many questions as possible, in order to build high karma scores. In its heyday, you would get answers within seconds of posting a question.

The other (inevitable) result, was that people who asked questions, were considered "lesser people," and that attitude came across, loud and clear, in many of the interactions that more senior folks had with questioners. They were treated as "supplicants." Some senior folks were good at hiding that attitude, some, not so much.

Speaking only for myself, I suspect that I have more experience and expertise, actually delivering product, than many of the more senior members, and it is pretty galling, to be treated with so much disrespect.

And, of course, another inevitable thing, was that the site became a spamhaven. There was a lot of "shill-spamming," where someone asks a question, and many of the "answers" point to some commercial product. If you attempted to seriously answer the question, your answer was often downvoted, causing you damage. I think they got nuked fairly quickly, but it was quite a problem, for a while (It's still a huge problem in LinkedIn groups. I never participate in those, anymore).

I have found that, whenever I design anything; whether an app, or a community, I need to take human nature into account.

GenshoTikamura

5 days ago

Yes, it's been an important part of tricking humans into sharing their knowledge with other humans to obtain a huge Q&A dataset to train the AI without any consent of said people.

https://meta.stackexchange.com/questions/399619/our-partners...

aws-user

5 days ago

My goal from posting on various forums like SO is to scale the impact of my knowledge to as many people as possible, to give something back. I really don't care what modality or mechanism is used to distribute my contribution to others.

Why should I care if my SO answer I posted 7 years ago ends up in an LLM output in some random model? I wasn't getting paid for it anyway, and didn't expect to.

I view my random contributions across the web ending up in LLMs as a good thing, my posts now potentially reach even more people & places than it would have on a single forum site, that's the whole point of me posting online. Maybe I'm an outlier here.

ChrisMarshallNY

5 days ago

That's an issue. It will likely turn into a Worm Ouroboros.

There's usually some "iteration," with ChatGPT giving me deprecated APIs and whatnot.

kamaal

5 days ago

>>I'm not afraid to ask "stupid" questions -That is critical.

AI won't judge and shame you in front of the whole world, for asking stupid questions, or not RTFM'ing well enought, like Stackoverflow users do. Nor will it tell you, your questions are irrelevant.

I think this is the most killer AI feature ever.

ChrisMarshallNY

5 days ago

I’ve always worked that way. In school (or in seminars), I ask questions that may have the whole room in stitches, but I always learn the lesson. The worst teacher I ever had, was a genius calculus professor, who would harangue you in front of the class, for asking a “stupid” question. That’s the only class I ever took an Incomplete.

That’s the one thing about SO that I always found infuriating. It seems their favorite shade, is inferring that you’re “lazy,” and shaming you for not already having the answer. If anyone has ever looked at my code, “lazy” is probably not a word that springs to mind.

In most cases, I could definitely get the answer, myself, but it would take a while, and getting pointers might save me hours. I just need a hint, so that I can work out an answer.

With SO, I usually just bit my tongue, and accepted the slap, as well as the answer.

An LLM can actually look at a large block of code, and determine some boneheaded typo I made. That’s exactly what it did, yesterday. I just dumped my entire file into it, and said “I am bereft of clue. Do you have any idea why the tab items aren’t enabling properly?”. It then said “Yes, it’s because you didn’t propagate the tag from the wrapper into the custom view, here.” It not only pointed out the source error, but also explained how it resulted in the observed symptoms.

In a few seconds, it not only analyzed, but understood an entire 500-line view controller source file, and saw my mistake, which was just failing to do one extra step in an initializer.

There’s absolutely no way that I could have asked that question on SO. It would have been closed down, immediately. Instead, I had the answer in ten seconds.

I do think that LLMs are likely to “train” us to not “think things through,” but they said the same thing about using calculators. Calculators just freed us up to think about more important stuff. I am not so good at arithmetic, these days, but I no longer need to be. It’s like Machine Code. I learned it, but don’t miss it.

kamaal

5 days ago

>>I’ve always worked that way. In school (or in seminars), I ask questions that may have the whole room in stitches, but I always learn the lesson.

In my experience, if a question is understood well enough, it basically directly translates into a solution. In most cases parts of questions are not well understood, or require going into detail/simplification/has a definition we don't know etc etc.

This is where being able to ask questions and getting clear answers helps. AI basically helps your do understand the problem as you probe deeper and deeper into the question itself.

Most human users would give up after answering you after a while, several would send you through a humiliating ritual and leaving you with a life long fear of asking questions. This prevents learning, as a good way of developing imagination is asking questions. There is only that much you can derive from a vanilla definition.

AI will be revolutionary for just this reason alone.

dematz

5 days ago

Forcing you to read through your 500 line view controller does have the side effect of you learning a bunch of other valuable things and strengthening your mental model of the problem. Maybe all unrelated to fixing your actual problem ofc, but also maybe helpful in the long run.

Or maybe not helpful in the long run, I feel like AI is the most magical when used on things that you can completely abstract away and say as long as it works, I don't care what's in it. Especially libraries where you don't want to read their documentation or develop that mental model of what it does. For your own view, idk it's still helpful when AI points out why it's not working, but more of a balance vs working on it yourself to understand it too.

ChrisMarshallNY

5 days ago

Well, the old Java model, where you have dozens of small files, for even the simplest applications, may be better for humans, but it's difficult to feed that to an LLM prompt. With the way I work, I can literally copy and paste. My files aren't so big, that they choke the server, but they are big enough to encompass the whole domain. I use SwiftLint to keep my files from getting too massive, but I also like to keep things that are logically connected, together.

Judge for yourself.

Here's the file I am working on: [0].

The issue was in this initializer: [1]. In particular, this line was missing: [2]. I had switched to using a UIButton as a custom view, so the callback only got the button, instead of the container UIBarButtonItem. I needed to propagate the tag into the button.

[0] https://github.com/LittleGreenViper/SwipeTabController/blob/...

[1] https://github.com/LittleGreenViper/SwipeTabController/blob/...

[2] https://github.com/LittleGreenViper/SwipeTabController/blob/...

eleveriven

5 days ago

Agree on the verbosity and occasional naivety. But the fact that it gives working starting points is what really moves the needle. It gets me unstuck faster, and I still get to do the creative, architectural stuff

ChrisMarshallNY

5 days ago

Yup.

I’ll ask it how to accomplish some task that I’ve not done, before, and it will give me a working solution. It won’t necessarily be a good solution, but it will work.

I can then figure out how it got there, and maybe determine a more effective/efficient manner.

jandrewrogers

5 days ago

The reaction to this article is interesting. I have found AI to be useful in software contexts that most people never exercise or expect based on their intuitions of what an LLM can do.

For me, a highly productive but boring use of LLMs for code is that they excel at providing midwit “best practice” solutions to common problems. They are better documentation than the documentation and can do a lot of leg work e.g. Linux syscall implementation details. My application domains tend to require more sophisticated solutions than an LLM can provide but they still save a lot of rote effort. A lot of software development exists almost entirely in the midwit zone.

Much more interesting, they are decent at reducing concepts in literature to code practice for which there are no code examples. Google and StackOverflow turn up nothing. For example, I’ve found them useful for generating specialized implementations of non-Euclidean computational geometry algorithms that don’t really exist in the wild that I’ve ever seen. This is a big win, it literally turns months of effort into hours of effort.

On the other hand, I do a lot of work with algorithms that don’t exist in literature, never mind public code, with extremely performance-engineered implementations. There is an important take away from this too: LLMs are hilariously bad at helping with this but so are human software developers if required to do the same thing with no context.

Knowledge for which there is little or no training data is currently a formidable moat, both for LLMs and humans.

joshvm

5 days ago

I agree on the distillation step. They're very good at identifying what problem you're trying to solve, especially if you don't have much experience in the domain. I've found ChatGPT great for figuring out problems involving combinatorics. And then you can be canny and ask the LLM to write the next prompt to get some boilerplate code. I like to do this with ChatGPT 4.5 (heavily rate limited) targeting 4o.

Getting good code is still a pain with chat, but you can go a fair way by reducing the context as much as possible and only asking for core functions (no point wasting hundreds of tokens on an arg parser). I'd like to play with Cursor or Zed a bit more.

Debugging sometimes works. 4.5 often figures out errors in one, 4o often struggles and spits out suggestions like it's a customer support agent (it also falls into repetition fairly often if you tell it "no, so what else could it be?"). You really have to work to convince it up front that you're not an idiot.

eleveriven

5 days ago

The moat isn't intelligence, it's context

hatthew

5 days ago

What? This is exactly what I would expect an LLM to be good at. Stack Overflow gives common solutions to common problems. LLMs give niche solutions to problems that aren't common enough for extensive discussion, but are still already solved if you find the right repository or study the documentation thoroughly.

Neither helps with the hardest and most time consuming problems faced by software engineers: solving problems that have never been (publicly) solved before, making hard decisions based on intuition about what will be necessary in the future, understanding the interactions between broad webs of dependency in large codebases, and working with proprietary domain knowledge.

pera

5 days ago

It's fascinating how over the past year we have had almost daily posts like this one, yet from the outside everything looks exactly the same, isn't that very weird?

Why haven't we seen an explosion of new start-ups, products or features? Why do we still see hundreds of bug tickets on every issue tracking page? Have you noticed anything different on any changelog?

I invite tptacek, or any other chatbot enthusiast around, to publish project metrics and show some actual numbers.

simonw

5 days ago

"Why haven't we seen an explosion of new start-ups, products or features?"

You're posting this question on a forum hosted by YC. Here's a story from March 2024: "YC’s latest W24 batch includes 240 companies. A significant portion of the companies have some AI component, with 63% tagged as “Artificial Intelligence” — a notable increase from 51% in the preceding S23 batch and 29% before that.". https://jamesin.substack.com/p/analysis-of-ycs-latest-w24-ba...

I've not seen the same analysis for more recent batches.

maplant

5 days ago

I don't think that refutes the parent's point. So many AI companies, but where are the companies _using_ the AI?

ndiddy

5 days ago

It's also interesting how when you look at the websites for the new wave of AI B2B SaaS startups, most of the customers they list are other AI B2B SaaS startups. It makes me wonder how much of the "AI industry" is just startups sending VC money back and forth to each other.

dabockster

5 days ago

Probably a large part of it, honestly. I was just talking with someone the other day about how the whole cloud economy for the past 15 years may have been grossly exaggerated by very loud VC backed companies flooding the internet with blog posts like this one. The big behemoths - hospitals, governments, law offices, manufacturers - those people tend to run a lot of local tech for various reasons. And they’re also the most quiet when it comes to the internet.

computably

4 days ago

That sounds like "working as intended." The main selling points of the cloud are liquidity and accessibility. VC-backed startups are basically the target demographic.

pera

5 days ago

Sorry I don't follow, would you mind clarifying your point?

creesch

5 days ago

Not the person you are asking to clarify. But, I can read it in two ways:

1. A huge part of the demographic group visiting HN is biased in favor of AI given the sort of startups YN decides to fund.

2. The large amount of start-ups funded by HN that are related to AI should answer your question.

I am slightly leaning towards the first one combined with a little bit of the latter one. A lot of people working in startups will be used to building up a structure from scratch where incorporating the latest "thing" is not that big of a deal. It also means they rarely see the long term impact of the code they write.

They have a huge blind spot for the reality of existing code bases and company structures where introducing these tools isn't as easy and code needs to be maintained for much longer.

simonw

5 days ago

You said "Why haven't we seen an explosion of new start-ups?" so I replied by pointing out that a sizable percentage of recent YC batches are new AI startups.

I categorize that as "an explosion", personally. Do you disagree?

mattmanser

5 days ago

I'd disagree, because you've misunderstood his point.

His point is that if AI were so great, loads of NON AI startups would be appearing because the cost to make a company should have dramatically dropped and new opportunities for disruption in existing businesses should be available.

His point is that they aren't.

You pointing at AI startups in YC actually highlights the opposite of what you think it does. People are still looking for the problems for AI to solve, not solving problems with AI.

Your example is actually a bell-weather that there is no great leap forward yet, otherwise the companies delivering real world value would be taking spots in YC from the AI tooling companies. Because they'd be disrupting existing businesses and making lots of money, instead of trying to sell AI tools.

It's like you pointing at the large batches of YC companies doing crypto 5/10 years ago and saying that proves crypto is a game changer and everyone would soon be using crypto in their development.

The YC companies are focused on the AI tool hype, not making money by solving real world problems.

simonw

5 days ago

Yeah, I agree. I don't think the business value of this new AI stuff has been unlocked yet. I think it will take a couple more years for the best practices for applying this stuff in an economically valuable way to be a) figured out and b) filter out to the rest of the economy.

pera

5 days ago

Yeah I don't agree that having more start-ups checking a box saying that at least one of their component uses some form of AI indicates that we are experiencing a surge in new start-ups.

The amount of start-ups getting into YC hasn't really changed YoY: The W23 batch had 282 companies and W24 260.

simonw

5 days ago

The YC acceptance number is a measure of how many startups YC has decided they can handle in a single batch. I think percentage of accepted startups doing X carries more information than total number of startups accepted as a whole.

danShumway

5 days ago

> The YC acceptance number is a measure of how many startups YC has decided they can handle in a single batch.

Agreed.

> I think percentage of accepted startups doing X carries more information than total number of startups accepted as a whole.

Disagreed, I think it carries almost no information at all. The only thing your analysis signifies is that companies believe that using AI will make YCombinator more likely to fund them. They are competing for a limited number of slots and looking for ways to appeal to YCombinator. And the only thing that signifies is that YCombinator, specifically, likes funding AI startups right now.

This is not surprising. If you're an investment firm, funding AI companies is a good bet, in the same way that funding Web3 firms used to be a genuinely good bet. Investors ride market trends that are likely to make a company balloon in value or get bought out by a larger company. Investors are not optimizing for "what redefines the industry"; investors optimize to make a return on investment. And those are two entirely different things.

But it's also not surprising given YCombinator's past - the firm has always kind of gravitated towards hype cycles. It would be surprising if YCombinator wasn't following a major tech trend.

If you want evidence that we're seeing an explosion of companies, you need to look at something much more substantial than "YCombinator likes them".

And that's especially given the case that OP wasn't asking "is the tech industry gravitating towards AI?" They were asking, "are we seeing an explosion of new economic activity?"

And frankly, we're not. There are a lot of reasons for that which could have nothing to do with AI (tariffs and general market trends are probably a bigger issue). But we really aren't seeing the kind of transformation that is being talked about.

kubb

5 days ago

Most likely there’s a slight productivity increase.

The enthusiasts have a cognitive dissonance because they are pretty sure this is huge and we’re living in the future, so they go through various denial strategies when the execs ask them where the money is.

In this case it’s blame. These darned skeptics are ruining it for everyone.

aerhardt

5 days ago

This is an important question. The skepticism tracks with my personal experience - I feel 10-20% more productive but certainly not 5x when measured over a long period of time (say, the last 6 months or more)

I’m nonetheless willing to be patient and see how it plays out. If I’m skeptical about some grandiose claims I must also be equally skeptical and accepting about the possibility of large scale effects happening but not being apparent to me yet.

novaRom

5 days ago

There were many similar transformations in recent decades. I remember first Windows with true Graphics User Interface was big WOW: productivity boost, you can have all those windows and programs running at the same time! Compare it with DOS where you normally had just one active user-facing process.

knallfrosch

5 days ago

> Why haven't we seen an explosion of new start-ups, products or features? Why do we still see hundreds of bug tickets on every issue tracking page? Have you noticed anything different on any changelog?

In my personal experience (LLM and code suggestion only) it's because I use LLMs to code unimportant stuff. Actually thinking what I want to do with the business code is exhausting and I'd rather play a little with a fun project. Also, the unit tests that LLMs can now write (and which were too expensive to write myself) were never important to begin with.

xk_id

5 days ago

Simply put, if we’re living during such a major technological revolution, why does using software suck in such disastrous ways that were unthinkable even ten years ago?

deadmanku

5 days ago

Your argument relies on the idea of an "actual product", what is happening—and I’m seeing it firsthand both in my company’s codebase and in my personal projects—is that AI is contributing more and more to product development. If this trend continues, we may reach a point where 90% of a product is written by AI.

At that stage, the real value will lie in the remaining 10%—the part that requires human judgment, creativity, or architectural thinking. The rest will be seen as routine: simple instructions, redundant CRUD operations, boilerplate, and glue code.

If we focus only on the end result, human will inevitably write less code overall. And writing less code means fewer programming jobs.

creesch

5 days ago

You said a bunch without saying much. It also doesn't track. If the majority of AI work is supposed to be done by agents, capable of doing the entire process including making PRs. Then, why isn't there an explosion in such PRs on a large amount of open source projects? Even more so, why am I not seeing these PRs on AI related open source projects? If I need to target it even more directly, why am I a not seeing hints of this being applied on code agent repositories?

Call me naive, but you'd think that these specifically want to demonstrate how well their product works. Making an effort to distinguish PRs that are largely the work of their own agents. Yet, I am not seeing that.

I have no doubt that people find use in some aspects of these tools. Though I personally more subscribe to the interactive rubber ducky usage of them. But 90% from where I am standing seems like a very, very far way off.

philote

5 days ago

From what I've heard anecdotally, there have been a bunch more PRs and bug reports generated by AI. But I've also heard they're generally trash and just wasting the project maintainers' time.

jama211

3 days ago

More than likely loads of the PR’s you see _are_ mostly AI work, you just don’t know that because the developers cleaned it up and just post it as their own. Most PR’s where I work are like this, from what I see from speaking to the developers.

creesch

3 days ago

We must be moving in different circles as I am not seeing the same. Even if I went along with that reasoning it ignores the lack of highly visible work on projects that would want to advertise the effectiveness of their own tooling.

As I already said, I see a distinct lack of such labeled activity on open source ai code tools.

You'd think that those projects creating agentic tooling would want to show how effective they are. In fact, I would expect the people behind such projects to be all over threads like this pointing to tangible PRs, commits and other tasks these agents can apparently do so well.

Yet, all I am getting as pushback is vague handwaving "trust me, I am seeing it" claims. Even the blog post itself is nothing but that.

dabockster

5 days ago

> Then, why isn't there an explosion in such PRs on a large amount of open source projects?

People don't like working for free, either by themselves or with an AI agent.

creesch

5 days ago

1) Open source projects see plenty of commits where people happily work for "free".

2) Did you stop reading after that sentence? Because there is a whole lot more that follows, specifically:

> If I need to target it even more directly, why am I a not seeing hints of this being applied on code agent repositories? Call me naive, but you'd think that these specifically want to demonstrate how well their product works. Making an effort to distinguish PRs that are largely the work of their own agents. Yet, I am not seeing that.

Draiken

5 days ago

The entire open source community disagrees.

abe94

5 days ago

i don't know - I agree we haven't seen changes to our built environment, but as for an "explosion of new start-ups, products" we sort of are seeing that?

I see new AI assisted products everyday, and a lot of them have real usage. Beyond the code-assistants/gen companies which are very real examples, here's an anecdote.

I was thinking of writing a new story, and found http://sudowrite.com/ via an ad, an ai assistant for helping you write, its already used by a ton of journalists and serious writers, and am trying it out.

Then i wanted to plan a trip - tried google but saw nothing useful, and then asked chatgpt and now have a clear plan

creesch

5 days ago

> I was thinking of writing a new story, and found http://sudowrite.com/ via an ad, an ai assistant for helping you write, its already used by a ton of journalists and serious writers, and am trying it out.

I am not seeing anything indicating it is actually used by a ton of journalists and serious writers. I highly doubt it is, the FAQ is also paper thin in as far as substance goes. I highly doubt they are training/hosting their own models yet I see only vague third party references in their privacy policy. Their pricing is less than transparent given that they don't really explain how their "credits" translate to actual usage. They blatantly advertise this to be for students, which is problematic in itself.

This ignores all the other issues around so heavily depending on LLMs for your writing. This is an interesting quirk for starters: https://www.theguardian.com/technology/2024/apr/16/techscape... . But there are many more issues about relying so heavily on LLM tools for writing.

So this example, to me, is actually exemplifying the issue of overselling capabilities while handwaving away any potential issues that is so prevalent in the AI space.

superamit

5 days ago

Hey co-founder of Sudowrite here. We indeed have thousands of writers paying for and using the platform. However, we aim to serve professional novelists, not journalists or students. We have some of both using it, but it's heavily designed and priced for novelists making a living off their work.

We released our own fiction-specific model earlier this year - you can read more it at https://www.sudowrite.com/muse

A much-improved version 1.5 came out today -- it's preferred 2-to-1 vs Claude in blind tests with our users.

You're right on the faq -- alas, we've been very product-focused and haven't done the best job keeping the marketing site up to date. What questions do you wish we'd answer there?

creesch

5 days ago

> We indeed have thousands of writers paying for and using the platform. However, we aim to serve professional novelists, not journalists or students. We have some of both using it

Your marketing material quotes a lot of journalists, giving the impression they too use it a lot. I have my reservations about LLMs being used for professional writing, but for the moment I'll assume that Muse handles a lot of those concerns perfectly. I'll try to focus on the more immediate and actual concerns.

Your pricing specifically has a "Hobby & Student" section which mentions "Perfect for people who write for fun or for school". This is problematic to me, I'll get to why later when I answer you question about things missing from the FAQ.

> What questions do you wish we'd answer there?

Well it would be nice if you didn't hand wave away some actual potential issues. The FAQ also reads more like loose marketing copy than a policy document.

- What languages does Sudowrite work in?

Very vague answer here. Just be honest and say it highly depends on the amount of source material and that for many languages the result likely will be not that good.

- Is this magic?

Cute, but doesn't belong in a FAQ

- Can Sudowrite plagiarize?

You are doing various things here that are disingenuous.

You basically talk around the issue by saying "well, next word prediction isn't exactly plagiarism". To me it strongly suggests the models used have been trained on material that you can plagiarize. Which in itself is already an issue.

Then there is the blame shifting to the user saying that it is up to the user to plagiarize or not. Which is not honest, the user has no insights in the training material.

"As long as your own writing is original, you'll get more original writing out of Sudowrite." This is a probabilistic statement, not a guarantee. It also, again is blame shifting.

- Is this cheating?

Way too generic. Which also brings me to you guys actively marketing to students. Which I feel is close to moral bankruptcy. Again, you sort of talk around the issue are basically saying "it isn't cheating as long as you don't use it to cheat". Which is technically true, but come on guys...

In many contexts (academia, specific writing competitions, journalism), using something like sudowrite to generate or significantly augment text would be considered cheating or against guidelines, regardless of intent. In fact, in many school and academic settings using tools like these is detrimental to what they are trying to achieve by having the students write their own text from scratch without aid.

- What public language models does Sudowrite use and how were they trained?

Very vague, it also made me go back to the privacy policy you guys have in place. Given you clearly use multiple providers. I then noticed it was last updated in 2020? I highly doubt you guys have been around for that long, making me think it was copy pasted from elsewhere. This shows as it says "Policy was created with WebsitePolicies." which just makes this generic boilerplate. This honestly makes me wonder how much of it is abided by.

It being so generic also means the privacy policy does not clearly mention these providers while effectively all data from users likely goes to them.

*This is just about current questions in the FAQ*. The FAQ is oddly lacking in regards to muse, some of it is on the muse page itself. But that there I am running in similar issues

- Ethically trained on fiction Muse is exclusively trained on a curated dataset with 100% informed consent from the authors.

Bold and big claim. I applaud it if true, but there is no way to verify other than trusting your word.

There is a lot more I could expand on. But to be frank, that is not my job. You are far from the only AI related service operating in this problematic way. It might even run deeper in general startup culture. But honestly, even if your service is awesome and ethically entirely sound I don't feel taken seriously by the publicly information you provide. It is almost if you are afraid to be real with customers, to me you are overselling and overhyping. Again, you are far from the only company doing so, you just happened to be brought up by the other user.

superamit

5 days ago

Wow, so many assumptions here that don't make sense to me, but I realize we all have different perspectives on this stuff. Thank you for sharing yours! I really do appreciate it.

I won't go line-by-line here defending the cutesy copy and all that since it's not my job to argue with people on the internet either… but on a few key points that interested me:

- language support: I don't believe we're being disingenuous. Sudowrite works well in many languages. We have authors teaching classes on using Sudowrite in multiple languages. In fact, there's one on German tomorrow and one on French next week: https://lu.ma/sudowrite Our community runs classes nearly every day.

- student usage - We do sometimes offer a student discount when people write in to ask for it, and we've had multiple collage and high school classes use sudowrite in writing classes. We'll often give free accounts to the class when professors reach out. I don't believe AI use in education is unethical. I think AI as copilot is the future of most creative work, and it will seem silly for teachers not to incorporate these tools in the future. Many already are! All that said, we do not market to students as you claim. Not because we think it's immoral -- we do not -- but because we think they have better options. ChatGPT is free, students are cheap. We make a professional tool for professional authors and it is not free nor cheap. It would not make sense for our business to market to students.

- press quotes -- Yes, we quote journalists because they're the ones who've written articles about us. You can google "New Yorker sudowrite" etc and see the articles. Some of those journalists also write fiction -- that one who wrote the New Yorker feature had a book he co-wrote with AI reviewed in The New York Times.

> I then noticed it was last updated in 2020? I highly doubt you guys have been around for that long

So many of these objections feel bizarre to me because they're trivial to fact-check. Here's a New York Times article that mentions us, written in 2020. We were one of the first companies to use LLMs in this wave and sought and gained access to GPT-3 prior to public API availability. https://www.nytimes.com/2020/11/24/science/artificial-intell...

creesch

5 days ago

> Wow, so many assumptions here that don't make sense to me, but I realize we all have different perspectives on this stuff.

I realize they don't make sense to you, otherwise the website would contain different information. If I had to try to frame it more clearly I'd say that for a company whose core product revolves around clear writing, your website's information is surprisingly vague and evasive in some areas. I simply think it would make for a stronger and confident message if that information was just there. Which, might I remind you, I have said is true for many companies selling LLM based services and products.

> language support: I don't believe we're being disingenuous. Sudowrite works well in many languages.

I am sure it does, those languages with the highest presence in the training data. French and German doing well doesn't surprise me given the numbers I have seen there. I think this FAQ section could be much clearer here.

> we do not market to students as you claim.

I guess that your pricing page specifically has a "Hobby & Student" tier which mentions "Perfect for people who write for fun or for school" doesn't count as marketing to students?

> I don't believe AI use in education is unethical.

Neither do I, if it is part of the curriculum and the goal. For many language related course including writing using assistive tooling, certainly tooling that highly impacts style defeats the didactic purpose.

> So many of these objections feel bizarre to me because they're trivial to fact-check. Here's a New York Times article that mentions us, written in 2020. We were one of the first companies to use LLMs in this wave and sought and gained access to GPT-3 prior to public API availability.

Okay, I already went out of my way to go over the entire website because you asked. I am not doing a hit piece on you guys specifically as I specifically said you just happened to be linked by the other person. It was an assumption on my side, but reasonable given the age of most LLM companies. More importantly, that is not the main point I am making there anyway.

Since 2020 the landscape around LLMs changed drastically. Including the way privacy is being handled and looked at. You would think that this would result in changes to the policy in that period. In fact, I would think that the introduction of your own model would at the very least warrant some changes there. Not to mention that using copy pasted boilerplate for 5 years to me does not give a confident signal about how seriously you are taking privacy.

While you are not obligated to respond to me as I am just one random stranger on the internet. I would be remiss if I didn't make it clear that it is the overall tone and combined points that make me critical. Not just the ones that piqued your interest.

tiborsaas

5 days ago

Maybe because these companies are smaller and fly under the radar. They require less funding team size is small, probably bankrolled by the founders.

At least that's what I do and what I see among friends.

danShumway

5 days ago

So as an example of what this could look like that would be convincing to me. I started out pretty firmly believing that Rust was a fad.

Then Mozilla and Google did things with it that I did not think were possible for them to do. Not "they wrote a bunch of code with it", stuff like "they eliminated an entire class of bugs from a section of their codebase."

Then I watched a bunch of essentially hobby developers write kernel drivers for brand new architecture, and watched them turn brand new Macbooks into one of the best-in-class ways to run Linux. I do not believe they could have done that with their resources at that speed, using C or C++.

And at that point, you kind of begrudgingly say, "okay, I don't know if I like this, but fine, heck you, whatever. I guess it might genuinely redefine some parts of software development, you win."

So this is not impossible. You can convince devs like me that your tools are real and they work.

And frankly, there are a billion problems in modern computing that are high impact - stuff like Gnome accessibility, competitive browser engines, FOSS UX, collaboration tools. Entire demographics who have serious problems that could be solved by software if there was enough expertise and time and there were resources to solve them. Often, the issue at play is that there is no intersection between people who are very well acquainted with those communities and understand their needs, and people who have experience writing software.

In theory, LLMs help solve this. In theory. If you're a good programmer, and suddenly you have a tool that makes you 4x as productive as a developer: you could have a very serious impact on a lot of communities right now. I have not seen it happen. Not in the enterprise world, but also not in the FOSS world, not in communities with lower technical resources, not in the public sector. And again, I can be convinced by this, I have dismissed tools that I later switched opinions on because I saw the impact and I couldn't ignore the impact: Rust, NodeJS, Flatpak, etc, etc.

The problem is people have been telling me that Coding Assistants (and now Coding Agents) are one of those tools for multiple years now, and I'm still waiting to see the impact. I'm not waiting to see how many companies pick them up, I'm not waiting to see the job market. I'm waiting to see if this means that real stuff starts getting written at a higher quality significantly faster, and I don't see it.

I see a lot of individual devs showing me hobby projects, and a lot of AI startups, and... frankly, not much else.

chinchilla2020

5 days ago

and builder.ai just filed for bankruptcy after a billion dollar valuation. Timely.

tschellenbach

5 days ago

One of our AI enabled internal projects is moving ~135 faster than before. Of course you can't perfectly compare. New framework, better insights, updated libraries etc.

mrkeen

5 days ago

Bless your metric.

If you end up finishing it in 6 months, are you going to revise that estimate, or celebrate the fact that you don't need to wait until 2092 to use the project?

0x500x79

4 days ago

Yeah, these 100x numbers being thrown out are pretty wild. It dawned on me during the Shopify CEO post awhile back. 100x is unfathomable!

You did a years worth of work in 3 days? That is what 100x means.

DirkH

5 days ago

Lol, I appreciate your reality check

Fredkin

5 days ago

My main concern is not even mentioned in this article and there are hardly any comments here addressing it: Privacy / allowing 3rd parties to read and potentially train on your proprietary source code.

I've used LLMs to crank out code for tedious things (like generating C-APIs and calling into poorly documented libraries) but I'm not letting them touch my code until I can run it 100% locally offline. Would love to use the agentic stuff but from what I've heard it's still too slow to run on a high end workstation with a single 4080.

Or have things got better lately, and crucially is there good VisualStudio integration for running local agents / LLMs?

oblio

5 days ago

The problem is, it's like ads. Even if it's a paid subscription, the incentive to hoover up your code, metadata about your code or everything they can get away with is just too strong. I think it will just happen at some point, unless you run them 100% locally or hosted somewhere but self managed.

Fredkin

5 days ago

Yes and even if they were squeaky clean and didn't do anything bad with your code, there's also the possibility that they've been hacked and code is getting siphoned off somewhere.

svachalek

5 days ago

If you're big enough, you pay for on-prem frontier level tools. If not, you can do all right with a good GPU and local model, but it's not the same. And yeah there are plenty of plugins like Continue that support local or on-prem hosted models.

julvo

5 days ago

Do you store code on GitHub? If so, how is GH's guarantee to not use your code different from Cursor's (with privacy mode enabled)?

Fredkin

5 days ago

No I definitely don't use GitHub. Everything is entirely in-house.

But even if I did, there's a much more solid foundation of trust there, whereas these AI companies have been very shady with their 'better to ask for forgiveness, than permission' attitudes of late.

YmiYugy

5 days ago

All the model providers have offerings that promise not to train on your code. Can you trust them not to do it anyway? Maybe not. What's the actual damage if they did? You have to balance the expected productivity loss from forgoing the use of these tools with the risk that comes from sharing your code with them. You may want to reevaluate that somewhat frequently. I think there is a tendency in some circles to be a little bit to precious with their code.

julvo

5 days ago

Fair enough. In that case small models like devstral [1] are probably your best bet

[1] https://mistral.ai/news/devstral

theappsecguy

5 days ago

Cursor has no privacy mode whatsoever. I have been able to exfiltrate just about anything from it.

lomase

5 days ago

Do you commit your client code to Github?

lubujackson

5 days ago

Cursor has "privacy mode" in Pro mode that supposedly prevents training on your code.

hoofhearted

5 days ago

So would you be interested in a simple serverless service that lets you run LLM’s and is focused on privacy and security first instead of trying to run it on an expensive workstation with a single 4080?

dabockster

5 days ago

> but I'm not letting them touch my code until I can run it 100% locally offline

You can do this already with Ollama, RooCode, and a Docker compatible container engine.

OrangeMusic

3 days ago

Honestly is your code that revolutionary? Let's say I steal it, what can I do with it?

lucasyvas

5 days ago

I’m AI neutral but the writing style here is pretty dismissive, and - to match the tone of the article - annoying as fuck.

Most completely reasonable objections to LLMs were totally dismissed.

metalrain

5 days ago

Seems like very common AI hype post style.

"If you aren't using this AI feature you are left behind."

I usually block people for posting like that.

12_throw_away

5 days ago

Right? It's gripping reading about the author's insecurities, but fails to say anything useful about AI.

Anamon

11 hours ago

Yeah, this was one of the weakest pieces of writing on GenAI that I've ever read, and I've read a lot of it.

We should flat-out dismiss the issue of licenses and copyright violations because... all developers pirate TV shows?

Seriously... what a load of crap.

lomase

5 days ago

All this smart people is dumber than me.

stavros

5 days ago

I love LLMs, and I really like programming with Cursor, but I never managed to get the "agents with tons of stuff in their context" mode to work for me. I use Cursor like a glorified code completer, 4-5 lines at a time, because otherwise the LLM just makes too many mistakes that compound.

If you let it run in the "write my code for me" mode, and ask it to fix some mistake it made, it will always add more code, never remove any. In my experience, in the end the code just ends up so brittle that the LLM will soon get stuck at a point that it never manages to overcome some mistake, no matter how many times it tries.

Has anyone managed to solve this?

jks

5 days ago

Not a full solution, but one thing I've learned not to do is tell Cursor "you got that wrong, fix it like this". Instead, I go back to the previous prompt and click "Restore Checkpoint", edit the prompt and possibly the Cursor rules to steer it in the right direction.

When the model has the wrong solution in its context, it will use it when generating new code, and my feeling is that it doesn't handle the idea of "negative example" very well. Instead, delete the bad code and give it positive examples of the right approach.

lubujackson

5 days ago

Make sure you include lots of files of context (including any type definitions!) After a big change before approving any code ask it: "Is this the simplest and cleanest approach?" Usually it will make more revisions and actually produce clean code then. You can also say that in the original prompt, or specify the structure of the change you want it to make.

stavros

5 days ago

Oh nice, I'll try the simple thing, thanks!

csomar

5 days ago

No. And I don't think they are doing anything magical. Performance drop sharply after 50k tokens. Your LLM does best when you have a short 2-5K context.

mikeshi42

5 days ago

imo cursor's small context window on the $20 plan is what kills their agent mode. Try claude code or anything that lets you use larger context windows (I think cursor has a tier you can pay for now?). claude code regularly deletes and corrects code and LLMs are very much capable of that today.

__grob

5 days ago

I would love to know as well. I also have problems with the LLM coding agents when it comes to lesser-known languages like Julia. Has anybody solved this yet?

sv0t

5 days ago

in the last two weeks I've started using Cursor in the 'you do everything, I'm just your boss mode', to see how far I can push it. Right at this minute, I'm working on something thats become pretty big but I'm constantly on the verge of just going back to writing code like normal LOL.

That said, I'm super impressed by how quickly I've been able to get this far with something pretty niche and complex in places.

Here's what I've learned. There are million AI Bros on youtube who have the ultimate solution but they all boil down to a few basic things.

Make rules: make them before you get started and continue updating them as you go.

Lots of tiny modules: Push things into small bite sized bits of code with lots of interface documentation. This feels a bit unnatural when the code is prototype quality.

Documentation is key: the youtubers will often create detailed planning and specification documents in advance. I done this and it's hit and miss, what I've found works is explain what you want to build to an llm and have it create extremely concise documentation, then a rough checklist for implementation and then evolving these documents in cursor as I go.

This leads to a kind of plan --> document, implement --> document, run tests --> document workflow on each reasonable chunk of the design being the most smooth.

Don't let cursor ever see inside big datasets or huge folder trees. In fact keep cursor in a few folders writing source code and nothing else. To do this, early on build tools that can go and get information or make safe edits for cursor from those datasets without it attempting direct access.

The current project has a tools for working with the primary datasets, a job manager, a process manager, a log manager, a workflow manager, these all have functions for querying. Cursor is instructed to use these. It naturally doesn't want to but if you tell it 7 times out 10 it will :)

No mess: Cursor will like to make lots of random tests and processes along the way. instruct it to use a /debug/tests/ folder and wipe it clean often. force it to make 'production' code by having it be registered with the workflow manager and made allowable by process manager. This lets it play around for a while and get something up and running with it's weird little scripts and then implement it for real elsewhere and using the proper framework. The workflow manager needs to have documentation on how the script is used and the process manager needs to the source to be in a particular place with a particular set of standard interfaces.

you might say this is a lot of plumbing, but what isn't these days and it's not like I'm maintaining it - right ;)

Cursor is always about to go insane any minute, or can't remember what's going on. So most of it is about keeping things simple enough for it to focus on the bit it's working on and nothing else.

Regularly for work we have massive code bases written in our company style and built using our business processes. This kind of thing is absolutely not the place for Cursor. I can imagine a company setup from scratch to use Cursor but our current products and practices woud just make it impossibly time-consuming.

nitwit005

6 days ago

> Does an intern cost $20/month? Because that’s what Cursor.ai costs.

> Part of being a senior developer is making less-able coders productive, be they fleshly or algebraic. Using agents well is both a both a skill and an engineering project all its own, of prompts, indices, and (especially) tooling. LLMs only produce shitty code if you let them.

A junior developer often has negative value to a team, because they're sapping the time of more senior developers who have to help train them, review code, fix mistakes, etc. It can take a long while to break even.

The raw cost of Cursor's subscription is surely dwarfed by your own efforts, given that description. The actual calculous here should be the cost to corral Cursor, against the value of the code it generated.

fud101

5 days ago

So what is the end game here, we don't have any juniors? That's what we're celebrating? This whole thing makes no damn sense, we're just becoming more efficient at producing bad code at scale, which will be maintained by a workforce that is growing less proficient at that with the increasing adoption of AI - skills will decline and eventually we'll be living with more buggy software, maintained by underpaid, unappreciated and overwhelmed grey beards and it will be too late to turn back.

elric

5 days ago

I had a call with a recruiter the other day, who told me that they had simply stopped hiring juniors at all, when not too long ago they would be camping out at colleges on graduation day to snipe fresh graduates.

This is going to a problem real soon. There needs to be a realistic career path for software developers, or the entire field is at risk.

620gelato

5 days ago

This. No more juniors, and the skills of the seniors is going to atrophy when all they're doing is review and "move around" code an LLM spat out.

YmiYugy

5 days ago

I don't think that's necessarily true. Some companies and seniors certainly invest into training and mentoring their juniors and this can take a long time and be expensive. But there are a lot of places offer 3-6 months internships for people that are barely out of high school. Others heavily rely on overseas contractors. So in a lot of scenarios it does seem possible to use less capable labor in a beneficial way for software engineering.

Verdex

5 days ago

Hundreds of comments. Some say LLMs are the future. Others say they don't work today and they won't work tomorrow.

Videogame speed running has this problem solved. Livestream your 10x engineer LLM usage, a git commit annotated with it's prompt per change. Then everyone will see the result.

This doesn't seem like an area of debate. No complicated diagrams required. Just run the experiment and show the result.

M4v3R

5 days ago

I literally do this about twice a week on Twitch. I’m working on some hobby projects, and one constraint I’ve put on myself is to use LLMs for pretty much everything, regardless of whether it saves me time or not. The reason is twofold: I want to learn how to use them efficiently, and I want to constantly test the waters to see where the limits of their capabilities are. On my stream, you can see LLMs fail spectacularly one time and do hours of work in minutes another time.

I’m not alone in this - there are tons of other examples of people showing how they use LLMs online; you just need to search for them.

Verdex

5 days ago

Thanks. I'll give them a watch.

chinchilla2020

5 days ago

Agreed.

The article provides zero measurement, zero examples, zero numbers.

It's pure conjecture with no data or experiment to back it up. Unfortunately conjecture rises to the top on hackernews. A well built study on LLM effectiveness would fall off the front page quickly.

dbalatero

5 days ago

I'd honestly love to see this.

People always say "you just need to learn to prompt better" without providing any context as to what "better" looks like. (And, presumes that my prompt isn't good enough, which maybe it is maybe it isn't.)

The easy way out of that is "well every scenario is different" - great, show me a bunch of scenarios on a speed run video across many problems, so I can learn by watching.

theshrike79

5 days ago

It's because you get to the No True Scotsman -thing pretty fast.

If I use LLMs to code, say a Telegram bot that summarise the family calendars and current weather to a channel - someone will come in saying "but LLMs are shit because they can't handle this very esoteric hardware assembler I use EVERY DAY!!1"

lomase

5 days ago

But... do you know anybody who will give me 50k a year to write telegram bots for them?

theshrike79

5 days ago

It's not like every single problem you face at work is breaking new ground and exploring unknown reaches of computer science. If so, please hire me.

You will be writing CRUD operations and slapping together web apps on every level of experience. Even in (mobile) gaming there you're repeating the same structures as every game before.

Not a 100% of the time, but way more than 50%.

dbalatero

5 days ago

I'd like to think I could dismiss those responses and still learn something from a collection of samples.

blendaddict

2 days ago

The Primeagen did this. He locked himself in for 7 weeks and tried to build a game by only using cursor. The videos are more entertainment style but this is the closest thing i know of

https://www.youtube.com/watch?v=6x3weFV9YI4

weddpros

5 days ago

AI Coding is becoming an edge, and sharing your edge isn't the wisest thing to do, even more so when doubt is so prevalent!

Verdex

5 days ago

Extremely doubtful.

This thread has hundreds of comments where people are screaming that everyone needs to learn AI coding.

If it was such an edge would they not otherwise keep quiet?

wiseowise

5 days ago

Because there are forces that are trying to kill the momentum.

Imagine that there was a serum that gives you superhuman strength only under specific conditions that you’re supposed to discover. Then there’s half room who screams that it should be banned, because it is cheating/fake/doesn’t work. And there’s another half room that swears by it, because they know how to utilize it properly.

You know it works and you don’t want to give up your secret sauce or make another half of the room stronger.

quikoa

5 days ago

There are many billions of dollars at stake to try and gain momentum. That's just the direct investment from the AI companies. The entire tech sector wants it to succeed so they can save on salary from expensive engineers.

wiseowise

5 days ago

Paraphrasing the article: I don’t care, it makes me productive and allows me to enjoy work again and deliver business value.

quikoa

5 days ago

Did you just reply to the wrong comment? This doesn't say anything about momentum.

goatlover

5 days ago

Those forces are up against tens of billions of dollars and very well connected CEOs. Which is why I take overly positive views on LLMs with a grain of salt.

Verdex

5 days ago

Real life doesn't have superhuman serums.

However real life does have illicit drugs that many people hype up and claim that they need.

Also real life has performance enhancement drugs that cause a host of medical issues.

Even drugs for medical necessity come with a list of side effects.

dolebirchwood

5 days ago

Or they just aren't sociopaths, so they don't think that way?

GenshoTikamura

5 days ago

Quiet doesn't bring your AI corporation profits up.

YmiYugy

5 days ago

Unlikely. Programming in highly collaborative and efficiency is hard to measure. That creates incentives for programmers in competition to typically prioritize advertising their skills by demonstration over maintaining their secret edge. Be it at work or on the internet, if you help others by sharing your techniques you'll make them want to work with you and impress them with how smart you are. If you are keeping it all secret to maintain your edge, people will think of you as unhelpful and they won't know how smart you are, because it's very difficult to judge how difficult the things our accomplished were. The reason people don't stream themselves vibe coding is that's it's even less interesting to watch than regular coding.

mrkeen

5 days ago

It's not your edge anymore. It's AI's. And how is TFA any different?

jansan

5 days ago

Someone will always be idealistic enough to share. The fact that we do not see them now should raise a few eyebrows.

Nasrudith

5 days ago

I must disagree. Sharing your edge is the wisest possible thing you can do on a societal level. For a slightly silly idea would it be better to have say, everyone doing guesswork knots for how to tie their shoes vs a single reliable 'rabbit ears' technique? Then you can see the benefits to having edges widely shared as a norm. That is the foundation of how society can learn.

lomase

5 days ago

Just like blockchain, smart contracts and web 3.0 are the future!

lubujackson

5 days ago

I see all the negative responses, but this seems true to me. I am old enough to remember the dot com days and could see the transformative effect of the Internet from miles away when I was a teenager. Yet many, many people refused to acknowledge that someday soon we would do things like credit card transactions online, or that people might buy shoes without trying them on first, etc.

You could say it is a lack of imagination or not connecting the dots, but I think there is a more human reason. A lot of people don't want the disruption and are happy with the status quo. I'm a software engineer so I know how problematic AI may be for my job, but I think anyone who looks at our current state and the recent improvements should be able to see the writing on the wall here.

I for one am more curious than afraid of AI, because I have always felt that writing code was the worst part of being a programmer. I am much happier building product or solving interesting problems than tracking down elusive bugs or refactoring old codebases.

Anamon

10 hours ago

I disagree with that. I was around when the web grew into the mainstream, and almost everybody was sure that it would have a huge impact on every industry and activity it touched. There wasn't even remotely a level of skepticism comparable to those around VR, blockchain, and now GenAI.

And it seems pretty obvious why. The benefits were clear and palpable. Communication was going to become a heck of a lot easier, faster, cheaper, barriers were being lowered.

There's no such qualitative advantage offered by GenAI, compared to the way we did things before. Web vs. pre-Web, the benefits were clear.

GenAI? Some execs claim it's making stuff cheaper, but it doesn't consider quality and long-term effects, plus it's spouted by those with no technological knowledge and with a reputation to long have cashed out and moved on by the time their actions crash a company. Plus, still nobody seems to have figured out how to make money (real money, not VC) off of this. Faster -- again, at what price to quality?

Then there's the predictions. We've been told for about three years now about the explosive rise in quality we'll see from GenAI output. I'm still waiting. The predictions of wider spread, higher speed and lower cost of the web sounded plausible, and they materialised. Comparatively, I see a lot of very well-reasoned arguments for the hypothesis that GenAI has peaked (for now) and this is pretty much as good as it's going to get, with source data sets exhausted and increasingly polluted by poor GenAI slop. So far, the trajectory makes me believe this scenario to be a lot more likely.

None of this seems remotely comparable to the Internet or web cases to me. The web certainly didn't feel like a hype to me in the 90s and I don't remember anyone having had that view.

GenshoTikamura

5 days ago

So, programmers once had an edge in having their source closed, then fell for the open source evangelism and started sharing their code, which enabled the training of AI models, and now the next iteration of what was called programmers before and is now known as vibe coders has this notion of having an edge in having their chatbot prompts closed again?

Let's all just muse some and imagine what the next cycle of this wheel will look like.

pie_flavor

6 days ago

I have one very specific retort to the 'you are still responsible' point. High school kids write lots of notes. The notes frequently never get read, but the performance is worse without them: the act of writing them embeds them into your head. I allegedly know how to use a debugger, but I haven't in years: but for a number I could count on my fingers, nearly every bug report I have gotten I know exactly down to the line of code where it comes from, because I wrote it or something next to it (or can immediately ask someone who probably did). You don't get that with AI. The codebase is always new. Everything must be investigated carefully. When stuff slips through code review, even if it is a mistake you might have made, you would remember that you made it. When humans do not do the work, humans do not accrue the experience. (This may still be a good tradeoff, I haven't run any numbers. But it's not such an obvious tradeoff as TFA implies.)

sublinear

6 days ago

I have to completely agree with this and nobody says this enough.

This tradeoff of unfamiliarity with the codebase is a very well understood problem for decades. Maintaining a project is 99% of the time spent on a successful project.

In my opinion though, having AI write the initial code is just putting most people in a worse situation with almost no upside long term.

CurrentB

6 days ago

I agree I'm bullish on AI for coding generally, but I am curious how they'd get around this problem. Even if they can code at super human level, then you just get rarer super human bugs. Or is another AI going to debug it? Unless this loop is basically fail proof, does the human's job just becoming debugging the hardest things to debug (or at least a blindspot of the AI)

runevault

5 days ago

This comment reminds me of the old idiom (I cannot remember who is credited with it) that you should be careful not to use your full abilities writing code, because you have to be more clever to debug code than you were to write it.

This type of issue is part of why I've never felt the appeal of LLMs, I want to understand my code because it came from my brain and my understanding, or the same said of a teammate who I can then ask questions when I don't understand something.

miningape

5 days ago

I believe it was Brian Kernigan

ethagnawl

5 days ago

I haven't seen enough mention of using these tools to generate formal verification specs for their output, like TLA+. Of course, you're stuck with the same problem of having to verify the specs but you'll always be playing this game and it'd seem like this would be one of best, most reassuring ways to do so.

I'll have the look into this some more but I'm very curious about what the current state of the art is. I'm guessing it's not great because so few people do this in the first place -- because it's so tedious -- and there's probably not nearly enough training data for it to be practical to generate specs for a JavaScript GQL app or whatever these things are best at generating.

the_sleaze_

5 days ago

> becoming debugging the hardest things to debug

This is my current role, and one of the biggest reasons AI doesn't really help me day to day agent or otherwise.

In my ideal world, AI become so proficient at writing code that they eventually develop their own formally verifiable programming language, purpose built to be verifiable. So that there wouldn't be room for unknown unknowns.

skissane

6 days ago

> When stuff slips through code review, even if it is a mistake you might have made, you would remember that you made it.

I don’t know. Ever had the experience of looking at 5+ year old code and thinking “what idiot wrote this crap” and then checking “git blame” and realising “oh, I’m the idiot… why the hell did I do this? struggling to remember” - given enough time, humans start to forget why they did things a certain way… and sometimes the answer is simply “I didn’t know any better at the time, I do now”

> You don't get that with AI. The codebase is always new.

It depends on how you use AI… e.g. I will often ask an AI to write me code to do X because it gets me over the “hump” of getting started… but now this code is in front of me on the screen, I think “I don’t like how this code is written, I’m going to refactor it…” and by the time I’m done it is more my code than the AI’s

mrguyorama

6 days ago

Understanding code takes more effort than writing it, somehow. That's always been a huge problem in the industry, because code you wrote five years ago was written by someone else, but AI coding takes that from "all code in your org except the code you wrote in the past couple years" to "all code was written by someone else".

How well does your team work when you can't even answer a simple question about your system because nobody wrote, tested, played with the code in question?

How do you answer "Is it possible for our system to support split payments?" when not a single member of your team has even worked on the billing code?

No, code reviews do not familiarize an average dev to the level of understanding the code in question.

Mofpofjis

6 days ago

> Understanding code takes more effort than writing it

yes!

> somehow

not difficult to explain. Coding is a creative activity where you work top-down; you decompose the abstract/high-level into the detailed/low-level. You dictate the "events" happening to the code, you are in control. Reviewing is reactive; the code you review dictates what happens in your brain (you are under control, not in control), and you need to work bottom-up: you try to re-compose the whole from the fragments. Even for human coders, a detailed commit message is a pre-requisite before their code can undergo review. The reviewer is in the worse position, so he needs to be supported as much as possible.

LegionMammal978

6 days ago

Oddly, I don't tend to get that experience very much. More often, it's "That's not how I'd naively write that code, there must be some catch to it. If only I had the foresight to write a comment about it..." Alas, I'm still not very good at writing enough comments.

OccamsMirror

5 days ago

Now this is where AI assisted coding shines in my opinion.

I find myself both:

- writing a comment so that Copilot knows what to do

- letting Copilot write my comment when it knows what I did

I'm now a lot more reliable with my comment writing.

Mofpofjis

6 days ago

> why the hell did I do this? struggling to remember

- git blame

- always write good commit messages

BirAdam

5 days ago

5 years? You’re a genius. I can’t make sense of stuff I wrote last week.

SchemaLoad

5 days ago

Similar to almost self driving cars where you are still responsible. You're asking someone to do nothing at all other than being highly alert for long periods of time. That's just not how people work. There is no way someone can be ready to take over in an instant without actively engaging in the driving.

stock_toaster

6 days ago

I read a study[1] (caveat, not peer reviewed yet I don't think?) that seems to imply that you are correct.

  < When using GenAI tools, the effort invested in critical thinking
  < shifts from information gathering to information verification; 
  < from problem-solving to AI response integration; and from task
  < execution to task stewardship.

[1]: https://www.microsoft.com/en-us/research/wp-content/uploads/...

wayvey

6 days ago

This is a good point I think, and these steps take time and should definitely be done. I'm not sure people take this into account when talking about having AI code for them.

derefr

6 days ago

So do the thing that a student copying their notes from the board does: look at the PR on one monitor, and write your own equivalent PR by typing the changes line-for-line into your IDE on the other. Pretend copy/paste doesn’t exist. Pretend it’s code you saw in a YouTube video of a PowerPoint presentation, or a BASIC listing from one of those 1980s computing magazines.

(And, if you like, do as TFA says and rephrase the code into your own house style as you’re transcribing it. It’ll be better for it, and you’ll be mentally parsing the code you’re copying at a deeper level.)

roarcher

6 days ago

You still didn't have to build the mental model, understand the subtle tradeoffs and make the decisions that arrived at that design.

I'm amazed that people don't see this. Absolutely nobody would claim that copying a novel is the same thing as writing a novel.

the_snooze

6 days ago

I feel like the dismissal of mental models is a direct consequence on the tech industry's manaical focus on scale and efficiency as the end-all be-all values to optimize.

Nevermind other important values like resilience, adaptability, reliability, and scrutability. An AI writes a function foo() that does a thing correctly; who has the know-how that can figure out if foo() kills batteries, or under what conditions it could contribute to an ARP storm or disk thrashing, or what implicit hardware requirements it has?

Anamon

10 hours ago

Hunter S. Thompson claimed to have re-typed The Great Gatsby because he wanted to know what it feels like to write a great novel.

Apparently, he actually meant this as a somewhat serious piece of writing advice, but I still prefer my initial reading of it as sarcasm.

derefr

6 days ago

I am suspicious of this argument, because it would imply that you can’t understand the design intent / tradeoffs / etc of code written by your own coworkers.

Which: of course you can. Not least because both your coworkers and these coding agents produce changes with explanatory comments on any lines for which the justification or logic is non-obvious; but also because — AI PR or otherwise — the PR consists of commits, and those commits have commit messages further explaining them. And — AI submitter or otherwise — you can interrogate the PR’s submitter in the PR’s associated discussion thread, asking the submitter to justify the decisions made, explain parts you’re suspicious of, etc.

When you think about it, presuming your average FOSS project with an open contribution model, a PR from an AI agent is going to be at least strictly more “knowable” than a “drive-by” PR by an external one-time contributor who doesn’t respond to discussion-thread messages. (And sure, that’s a low bar — but it’s one that the average accepted and merged contribution in many smaller projects doesn’t even clear!)

roarcher

6 days ago

You understand your coworkers' PRs as thoroughly and intuitively as they do? Any significant PR will contain things you don't even notice or think to ask about. And the answers to the questions you do ask are the end result of a thought process you didn't go through and therefore also don't understand as deeply.

Back to the novel analogy, you could ask an author why he incorporated this or that character trait or plot point, but all the explanation in the world will not make you able to write the next chapter as well as he could.

chaps

6 days ago

This is how a (video game) programming class in my high school was taught. You had to transcribe the code from a Digipen book.... then fix any broken code. Not entirely sure if their many typos were intentional, but they very much helped learn because we had no choice but to correct their logic failures and taypos to move onto the next section. I'm still surprised 20 years later how well that system worked to teach and push us to branch our understandings.

SirHumphrey

6 days ago

Yes, I was just about to say. Typing out code is a way to lear syntax of a new language and it’s often recommended to not copy paste while you start learning.

galleywest200

6 days ago

Is this just repeating labor? Why not just write it all yourself in the first place if you are just going to need to copy it over later?

ezst

6 days ago

> The codebase is always new. Everything must be investigated carefully.

That's dreadful. Not only is familiarity with the code not valued, it is impossible to build for your own sake/sanity.

ddddang

6 days ago

100%, i had gemini write code for a blog in golang - it has some bugs and it took me a some time to find them.

To me the sweet spot is, i write the code with the "Help" of an LLM. It means i double check everything it generates and prompt it to write code block by block - frequently acting as an editor.

Either you want human intervention for correctness and extension or you don't. Having LLM's write large swaths of code is like completely relying on tesla's autopilot - you are probably more stressed than if you just drove yourself.

wayvey

6 days ago

The careful vetting of code and thoroughly testing it is super important, I would never even think of putting any generated code into any use without doing that.

Also your last comparison made me chuckle, good one :)

kubav027

6 days ago

Writing code is easier than long term maintenance. Any programmer is able to write so much code that he will not be able to maintain it. Unless there are good AI tools helping with maintenance there is no point to use generative tools for production code. From my experience AI tools are great for prototyping or optimizing procrastination.

tabletcorry

6 days ago

This level of knowledge is nearly impossible to maintain as the codebase grows though, beyond one or two people at a typical company. And tools need to exist for the new hire as well as the long-standing employee.

ezst

6 days ago

Welcome to project architecting, where the job isn't about putting more lines of code into this world, but more systems in place to track them. A well layered and structured codebase can grow for a very long time before it becomes too hard to maintain. And generally, the business complexity bites before the algorithmic one, and there's no quick fix for that.

throw_nbvc1234

6 days ago

It's cultural too. I've heard people say along the lines "we don't ship the org chart here" in a positive light, then in a later meeting complain that nobody understands what's going on in their owner-less monorepo.

Shipping the org chart isn't the only way to solve this problem but it is one that can work. But if you don't acknowledge the relationship between those problems, AGI itself probably isn't going to help (partially sarcastic).

0x1ceb00da

5 days ago

Talking to an LLM feels like talking to Leonard Shelby from memento. https://youtube.com/watch?v=Y3kNTvXVHvQ

xnx

5 days ago

AI tools are enabling the developer to a higher level of abstraction to engineering manager or product manager. Those roles do not need to be familiar with code in that detail.

Mofpofjis

6 days ago

This is it. Reading AI slop does not form synapses in your brain like writing the code yourself does.

JoshTriplett

6 days ago

Exactly. See also https://hazelweakly.me/blog/stop-building-ai-tools-backwards... for a detailed look at this aspect of AI coding.

mgraczyk

6 days ago

The important thing you are missing is that the learning landscape has now changed.

You are now responsible for learning how to use LLMs well. If an untrained vibe coder is more productive for me, while knowing nothing about how the code actually works, I will hire the vibe coder instead of you.

Learning is important, but it's most important that you learn how to use the best tools available so you can be productive. LLMs are not going away and they will only get better, so today that means you are responsible for learning how to use them, and that is already more important for most many roles than learning how to code yourself.

Mofpofjis

6 days ago

This is actually a good reason for exiting the industry before one's job goes away. Steering AI to barf up the right-looking pool of vomit is not the Flow-generating experience that many people have started to program for.

GenshoTikamura

5 days ago

There is room to move up the developer hierarchy at the company I work for, but I refuse to take that path for this very reason. The leadership has bought into AI as some kind of panacea, plus the company's plans to replace hundreds of human administrators in our B2C operations with AI strike me as downright evil.

nancyminusone

5 days ago

Title nitpick: The amount is people who care about AI for coding assistance is a relative minority. For everyone else, there's 'AI', which has a huge branding problem. 'AI' is filling all search results with trash, and creating trash websites full of trash to fill up the rest of the search results. It's generating trash images to put at the top of every blog post. It's eating up all the server load with scraping. It's what's been fooling my dad every day on Facebook.

When people are sick of AI, this is what they are talking about. AI hype people ignore this perspective each and every time.

It doesn't matter how great your paper Mill's paper is, if you're dumping PCBs in the river, people are going to quite rightly get pissed off.

mierz00

5 days ago

AI has a huge branding problem on HN but I haven’t found that to be true outside of tech.

Many people are very eager to see how AI can improve their lives.

I’m building a product for accountants that uses AI, initially I avoided using AI as a part of marketing copy. Turns out, people are much more interested when I tell them it uses AI behind the scenes.

xk_id

5 days ago

It actually varies by segment. The least technical people are enthusiastic about AI. The moderately technical (e.g tech hobbyists, artists and power users) are repulsed. The segment of tech professionals are themselves divided, but I think based on how much they like/hate their career as programmers. Either way, what’s very different about this AI bubble is that the early adopters are predominantly the least technical users. This is a notable inversion of the typical adoption curve.

cobertos

6 days ago

It's been so much more rewarding playing with AI coding tools on my own than through the subtle and not so subtle nudges at work. The work AI tools are a walled garden, have a shitty interface, feel made to extract from me than to help me. In my personal stuff, downloading models, playing with them, the tooling, the interactions, it all been so much more rewarding to give me stable comfortable workflows I can rely on and that work with my brain.

The dialog around it is so adversarial it's been hard figuring out how to proceed until dedicating a lot of effort to diving into the field myself, alone, on my personal time and learned what's comfortable to use it on.

j-bos

6 days ago

Exactly, seems much skepticism comes from only scratching the surface of what's possible.

FridgeSeal

6 days ago

Is there a term for “skeptics just haven’t used it enough” argument?

Because it frequently got rolled out in crypto-currency arguments too.

steveklabnik

6 days ago

I do think that's a poor argument, but there's a better version: tools take skills to use properly.

The other day, I needed to hammer two drywall anchors into some drywall. I didn't have a hammer handy. I used the back of a screwdriver. It sucked. It even technically worked! But it wasn't a pleasant experience. I could take away from this "screwdrivers are bullshit," but I'd be wrong: I was using a tool the wrong way. This doesn't mean that "if you just use a screwdriver more as a hammer, you'll like it", it means that I should use a screwdriver for screwing in screws and a hammer for hammering things.

cesarb

5 days ago

> Is there a term for “skeptics just haven’t used it enough” argument?

It's not an exact match to what you want, but "you're holding it wrong" is the closest I've found. (For those too young to have heard of it, it was an infamous rebuttal to criticism of a particular model of the iPhone: https://en.wikipedia.org/wiki/iPhone_4#Antenna)

AnimalMuppet

5 days ago

And Lisp arguments, and Haskell arguments, and FP in general arguments.

"You can't actually disagree with me. If you don't agree with me you just haven't thought it through/you don't know enough/you have bad motives." (Yeah, we need a better term for that.) You see this all the time, especially in politics but in many places. It's a cheap, lazy rhetorical move, designed to make the speaker feel better about holding their position without having to do the hard work of actually defending it.

oblio

6 days ago

Now, now, be nice. There is value to obtain for the user from current gen AI tools. From cryptocurrencies... uh...

cloogshicer

5 days ago

It's "skill issue" or "git gud".

rotis

5 days ago

The author defends mediocore code, yet wrote this piece:

https://fly.io/blog/vscode-ssh-wtf/

Where he dunks on how SSH access works in VSCode. I don't know. The code and architecture behind this feature may well be bananas, but gets the work done. Sounds like a clear case of mediocority. I wonder how does he reconcile those two articles together.

For me this is more of a clickbait. Both of the articles. With that in mind, if I am nuts for being sceptical of LLMs, I think it is fair to call the author a clickbaiter.

okdood64

5 days ago

How a feature works, and how a feature is coded (how "nice" & canonical the code is) are two very different things.

He's dunking on the design of it.

AnimalMuppet

5 days ago

Hmm. The author is a cryptography expert. So, honest question: Would he trust crypto code written by an AI?

(I suspect I know the answer, but I am not certain of it.)

svachalek

5 days ago

The whole point of crypto is you don't trust code written by anyone, even yourself.

ikrenji

4 days ago

the AI is going to use crypto libraries just like you would?

bigmadshoe

5 days ago

I agree with the main take in this article: the combination of agents + LLMs with large context windows + a large budget of tokens to iterate on problems can probably already yield some impressive results.

I take serious issue with the "but you have no idea what the code is" rebuttal, since it - to me - skims over the single largest issue with applying LLMs anywhere where important decisions will be made based on their outputs.

To quote from the article:

  People complain about LLM-generated code being 
  “probabilistic”. No it isn’t. 
  It’s code. It’s not Yacc output. It’s knowable. The LLM 
  might be stochastic. But the LLM doesn’t matter. What 
  matters is whether you can make sense of the result, and 
  whether your guardrails hold.

  Reading other people’s code is part of the job. If you can’t metabolize the 
  boring, repetitive code an LLM generates: skills issue! How are you handling the 
  chaos human developers turn out on a deadline?

The problem here is that LLMs are optimized to make their outputs convincing. The issue is exactly "whether you can make sense of the result", as the author said, or, in other words: whether you're immune to being conned by a model output that sounds correct but is not. Sure, "reading other people’s code is part of the job", but the failure modes of junior engineers are easily detectable. The failure modes of LLMs are not.

EDIT: formatting

proc0

5 days ago

It's also funny how it requires a lot of iterations for the average task.. and the user has to pay for the failures. No other product has this expectation, imagine a toaster that only toasts bread 20% of the time, and 50% it's half toasted.

NitpickLawyer

5 days ago

> The problem here is that LLMs are optimized to make their outputs convincing.

That may be true for chat aligned LLMs, but coding LLMs are trained w/ RL and rewards for correctness, nowadays. And there are efforts to apply this to the entire stack (i.e. better software glue, automatic guardrails, more extensive tool-use, access to LSP/debuggers/linters, etc).

I think this is the critical point in a lot of these debates that seem to be very popular right now. A lot of people try something and get the wrong impressions about what SotA is. It turns out that often that something is not the best way to do it (i.e. chatting in a web interface for coding), but people don't go the extra mile to actually try what would be best for them (i.e. coding IDEs, terminal agents, etc).

bigmadshoe

5 days ago

Which "coding LLMs" are you referring to that are trained purely on verifiably correct synthetic data? To my understanding o3, gemini 2.5 pro, claude 3.7 sonnet, etc. are all still aligned to human preferences using a reward function learned from human feedback. Any time a notion of success/correctness is deferred to a human, the model will have a chance to "game" the system by becoming more convincing as well as more correct.

Edit: thought I would include this below instead of in a separate comment:

Also, whether the models are trained purely on synthetic data or not, they suffer from these epistemological issues where they are unable to identify what they don't know. This means a very reasonable looking piece of code might be spit out for some out-of-distribution prompt where the model doesn't generalize well.

NitpickLawyer

5 days ago

> To my understanding o3, gemini 2.5 pro, claude 3.7 sonnet, etc. are all still aligned to human preferences using a reward function learned from human feedback.

We don't know how the "thinking" models are trained at the big3, but we know that open-source models have been trained with RL. There's no human in that loop. They are aligned based on rewards, and that process is automated.

> Which "coding LLMs" are you referring to that are trained purely on verifiably correct synthetic data?

The "thinking" ones (i.e. oN series, claudeThinking, gemini2.5 pro) and their open-source equivalents - qwq, R1, qwen3, some nemotrons, etc.

From the deepseek paper on R1 we know the model was trained with GRPO, which is a form of RL (reinforcement learning). QwQ and the rest were likely trained in a similar way. (before GRPO, another popular method was PPO. And I've seen work on unsupervised DPO, where the pairs are generated by having a model generate n rollouts, verify them (i.e. run tests) and use that to guide your pair creation)

bigmadshoe

5 days ago

Sure, it is possible that these models at the big 3 are trained with no human feedback, I personally find it unlikely that they aren't at least aligned with human feedback, which can still introduce a bias in the direction of convincing responses.

You make a fair point that there are alternatives (e.g. DeepSeek r1) which avoid most of the human feedback (my understanding is the model they serve is still aligned by human responses for safety).

I guess I have to do some more reading. I'm a machine learning engineer but don't train LLMs.

jszymborski

5 days ago

The argument that I've heard against LLMs for code is that they create bugs that, by design, are very difficult to spot.

The LLM has one job, to make code that looks plausible. That's it. There's no logic gone into writing that bit of code. So the bugs often won't be like those a programmer makes. Instead, they can introduce a whole new class of bug that's way harder to debug.

vanschelven

5 days ago

This is exactly what I wrote about when I wrote "Copilot Induced Crash" [0]

Funny story: when I first posted that and had a couple of thousand readers, I had many comments of the type "you should just read the code carefully on review", but _nobody_ pointed out the fact that the opening example (the so called "right code") had the exact same problem as described in the article, proving exactly what you just said: it's hard to spot problems that are caused by plausibility machines.

[0] https://www.bugsink.com/blog/copilot-induced-crash/

okanat

5 days ago

If it crashes, you are very lucky.

AI generated code will fuck up so many lives. The post office software in the UK did it without AI. I cannot imagine the way and the number of lives will be ruined since some consultancy vibe coded some government system. I might come to appreciate the German bureaucracy and backwardness.

intrasight

5 days ago

My philosophy is to let the LLM either write the logic or write the tests - but not both. If you write the tests and it writes the logic and it passes all of your tests, then the LLM did its job. If there are bugs, there were bugs in your tests.

underdeserver

5 days ago

That rather depends on the type of bug and what kinds of tests you would write.

LLMs are way faster than me at writing tests. Just prompt for the kind of test you want.

catlifeonmars

5 days ago

Idk about you but I spend much more time thinking about what ways the code is likely to break and deciding what to test. Actually writing tests is usually straightforward and fast with any sane architecture with good separation of concerns.

I can and do use AI to help with test coverage but coverage is pointless if you don’t catch the interesting edge cases.

lelanthran

4 days ago

> My philosophy is to let the LLM either write the logic or write the tests - but not both. If you write the tests and it writes the logic and it passes all of your tests, then the LLM did its job. If there are bugs, there were bugs in your tests.

Maybe use one LLMs to write the code and a wildly different one to write the tests and yet another wildly different one to generate an English description of each test while doing critical review.

joshribakoff

5 days ago

Disagree. You could write millions of tests for a function that simply sums two numbers, and it’s trivial to insert bugs while passing that test.

dfedbeef

5 days ago

This is pretty nifty, going to try this out!

therealmarv

5 days ago

I don't agree. What I do agree on is to do it not only with one LLM.

Quality increases if I double check code with a second LLM (especially o4 mini is great for that)

Or double check tests the same way.

Maybe even write tests and code with different LLMs if that is your worry.

fisherjeff

5 days ago

Yes, exactly - my (admittedly very limited!) experience has consistently generated well-written, working code that just doesn’t quite do what I asked. Often the results will be close to what I expect, and the coding errors do not necessarily jump out on a first line-by-line pass, so if I didn’t have a high degree of skepticism of the generated code in the first place, I could easily just run with it.

otabdeveloper4

5 days ago

> working code that just doesn’t quite do what I asked

Code that doesn't do what you want isn't "working", bro.

Working exactly to spec is the code's only job.

lukan

5 days ago

It is a bit ambiguous I think, there is also the meaning of "the code compiles/runs without errors". But I also prefer the meaning of, "code that is working to the spec".

devjab

5 days ago

For me it's mostly about the efficiency of the code they write. This is because I work in energy where efficiency matters because our datasets are so ridicilously large and every interface to that data is so ridicilously bad. I'd argue that for 95% of the software out there it won't really matter if you use a list or a generator in Python to iterate over data. It probably should and maybe this will change with cloud costs continious increasing, but we do also live in a world where 4chan ran on some apache server running a 10k line php file from 2015...

Anyway, this is where AI's have been really bad for us. As well as sometimes "overengineering" their bug prevention in extremely inefficient ways. The flip-side of this is of course that a lot of human programmers would make the same mistakes.

deanc

5 days ago

I’ve had the opposite experience. Just tell it to optimise for speed and iterate and give feedback. I’ve had JS code optimised specifically for v8 using bitwise operations. It’s brilliant.

mirkodrummer

5 days ago

Example code or it's just a claim :)

mcv

5 days ago

Note that it's a claim in response to another claim. It doesn't need to be held to a higher standard than its parent.

alex989

5 days ago

>Instead, they can introduce a whole new class of bug that's way harder to debug

That sounds like a new opportunity for a startup that will collect hundreds of millions a of dollars, brag about how their new AI prototype is so smart that it scares them, and devliver nothing

DiogenesKynikos

5 days ago

> There's no logic gone into writing that bit of code.

What makes you say that? If LLMs didn't reason about things, they wouldn't be able to do one hundredth of what they do.

mindwok

5 days ago

This is a misunderstanding. Modern LLMs are trained with RL to actually write good programs. They aren't just spewing tokens out.

godelski

5 days ago

No, YOU misunderstand. This isn't a thing RL can fix

  https://news.ycombinator.com/item?id=44163194

  https://news.ycombinator.com/item?id=44068943

It doesn't optimize "good programs". It interprets "humans interpretation of good programs." More accurately, "it optimizes what low paid over worked humans believe are good programs." Are you hiring your best and brightest to code review the LLMs?

Even if you do, it still optimizes tricking them. It will also optimize writing good programs, but you act like that's a well defined and measurable thing.

tptacek

5 days ago

I don't know if any of this applies to the arguments in my article, but most of the point of it is that progress in code production from LLMs is not a consequence of better models (or fine tuning or whatever), but rather on a shift in how LLMs are used, in agent loops with access to ground truth about whether things compile and pass automatic acceptance. And I'm not claiming that closed-loop agents reliably produce mergeable code, only that they've broken through a threshold where they produce enough mergeable code that they significantly accelerate development.

dwohnitmok

4 days ago

> I don't know if any of this applies to the arguments in my article, but most of the point of it is that progress in code production from LLMs is not a consequence of better models (or fine tuning or whatever), but rather on a shift in how LLMs are used, in agent loops with access to ground truth about whether things compile and pass automatic acceptance.

I very strongly disagree with this and think this reflects a misunderstanding of model capabilities. This sort of agentic loop with access to ground truth model has been tried in one form or another ever since GPT-3 came out. For four years they didn't work. Models would very quickly veer into incoherence no matter what tooling you gave them.

Only in the last year or so have models gotten capable enough to maintain coherence over long enough time scales that these loops work. And future model releases will tighten up these loops even more and scale them out to longer time horizons.

This is all to say that progress in code production has been essentially driven by progress in model capabilities, and agent loops are a side effect of that rather than the main driving force.

tptacek

4 days ago

Sure! Super happy to hear these kinds of objections because, while all the progress I'm personally perceiving is traceable to decisions different agent frameworks seem to be making, I'm totally open to the idea that model improvements have been instrumental in making these loops actually converge anywhere practical. I think near the core of my argument is simply the idea that we've crossed a threshold where current models plus these kinds of loops actually do work.

godelski

5 days ago

  > I don't know if any of this applies to the arguments

  > with access to ground truth

There's the connection. You think you have ground truth. No such thing exists

tptacek

5 days ago

It's even simpler than what 'rfrey said. You're here using "ground truth" in some kind of grand epistemic sense, and I simply mean "whether the exit code from a program was 1 or 0".

You can talk about how meaningful those exit codes and error messages are or aren't, but the point is that they are profoundly different than the information an LLM natively operates with, which are atomized weights predicting next tokens based on what an abstract notion of a correct line of code or an error message might look like. An LLM can (and will) lie to itself about what it is perceiving. An agent cannot; it's just 200 lines of Python, it literally can't.

godelski

5 days ago

  > You're here using "ground truth" in some kind of grand epistemic sense

I used the word "ground truth" because you did!

  >> in agent loops with access to ground truth about whether things compile and pass automatic acceptance.

Your critique about "my usage of ground truth" is the same critique I'm giving you about it! You really are doing a good job at making me feel like I'm going nuts...

  > the information an LLM natively operates with,

And do you actually know what this is?

I am a ML researcher you know. And one of those ones that keeps saying "you should learn the math." There's a reason for this, because it is really connected to what you're talking about here. They are opaque, but they sure aren't black boxes.

And it really sounds like you're thinking the "thinking" tokens are remotely representative of the internal processing. You're a daily HN user, I'm pretty sure you saw this one[0].

I'm not saying anything OpenAI hasn't[1]. I just recognize that this applies to more than a very specific narrow case...

[0] https://news.ycombinator.com/item?id=44074111

[1] https://cdn.openai.com/pdf/34f2ada6-870f-4c26-9790-fd8def563...

tptacek

5 days ago

Right, I'm just saying, I meant something else by the term than you did. Again: my point is, the math of the LLM doesn't matter to the point I'm making. It's not the model figuring out whether the code actually compiled. It's 200 lines of almost straight-line Python code that has cracked the elusive computer science problem of running an executable and checking the exit code.

godelski

3 days ago

  > the math of the LLM doesn't matter to the point I'm making.

The point I'm making is that to make effective use out of a tool, you should know what the tool can and can't do. Really the "all models are wrong, but some models are useful" paradigm. To know which models are useful you have to know how your models are wrong.

Sure, you can blindly trust too. But that can get pretty dangerous. While most of the time we leverage high levels of trust, I'm unconvinced our models allow us to trust them. Without being able to strongly demonstrate that they do not optimize tricking us (in our domains of interest) then they should be treated as distrustful, not trustful.

tptacek

3 days ago

The part of the tool that I'm "blindly trusting" is the part any competent programmer can reason about.

rfrey

5 days ago

Yes it does. Ground truth is what 30 years of experience says constitutes mergeable code. Ground truth doesn't mean "perfect, provably correct code", it means whatever your best benchmark for acceptable code is.

In medical AI, where I'm currently working, "ground truth" is usually whatever human experts say about a medical image, and is rarely perfect. The goal is always to do better than whatever the current ground truth is.

godelski

5 days ago

I understand how you interpreted my comment as this. That's my bad.

But even when taking state of the art knowledge as a ground truth aligning to that is incredibly hard. Medicine is a great example. You're trying to create a causal graph in a highly noisy environment. You ask 10 doctors and you'll get 12 diagnoses. The problem is subtle things become incredibly important. Which is exactly what makes measurements so fucking hard. There is no state of the art in a well defined sense.

The point is that in most domains this is how things are. Even in programming.

Getting the right answer isn't enough

tuhlatte

5 days ago

Those links mostly discuss the original RLHF used to train e.g. ChatGPT 3.5. Current paradigms are shifting towards RLVR (reinforcement learning with verifiable rewards), which absolutely can optimize good programs.

You can definitely still run into some of the problems eluded to in the first link. Think hacking unit tests, deception, etc -- but the bar is less "create a perfect RL environment" than "create an RL environment where solving the problem is easier than reward hacking." It might be possible to exploit a bug in the Lean 4 proof assistant to prove a mathematical statement, but I suspect it will usually be easier for an LLM to just write a correct proof. Current RL environments aren't as watertight as Lean 4, but there's certainly work to make them more watertight.

This is in no way a "solved" problem, but I do see it as a counter to your assertion that "This isn't a thing RL can fix." RL is powerful.

godelski

5 days ago

  > Current paradigms are shifting towards RLVR, which absolutely can optimize good programs

I think you've misunderstood. RL is great. Hell, RLHF has done a lot of good. I'm not saying LLM are useless.

But no, it's much more complex than you claim. RLVM can optimize for correct answers in the narrow domains where there are correct answers but it can't optimize good programs. There's a big difference.

You're right that Lean, Coq, and other ATPs can prove mathematical statements, but they also don't ensure that a program is good. There's frequently an infinite number of proofs that are correct, but most of those are terrible proofs.

This is the same problem all the coding benchmarks face. Even if the LLM isn't cheating, testing isn't enough. If it was we'd never do code review lol. I can pass a test with an algorithm that's O(n^3) despite there being an O(1) solution.

You're right that it makes it better, but it doesn't fix the underlying problem I'm discussing.

Not everything is verifiable.

Verifiability isn't enough.

If you'd like to prove me wrong in the former you're going to need to demonstrate that there are provably true statements to lots of things. I'm not expecting you to defy my namesake, nor will I ask you prove correctness and solve the related halting problem.

You can't prove an image is high fidelity. You can't prove a song sounds good. You can't prove a poem is a poem. You can't prove this sentence is English. The world is messy as fuck and most things are highly subjective.

But the problem isn't binary, it is continuous. I said we're using Justice Potter optimization, you can't even define what porn is. These definitions change over time, often rapidly!

You're forgetting about the tyrannical of metrics. Metrics are great, powerful tools that are incredibly useful. But if you think they're perfectly aligned with what you intend to measure then they become tools that work against you. Goodhart's Law. Metrics only work as guides. They're no different than any other powerful tool, if you use it wrong you get hurt.

If you really want to understand this I really encourage you to deep dive into this stuff. You need to get into the math. Into the weeds. You'll find a lot of help with metamathematics (i.e. my namesake), metaphysics (Ian Hacking is a good start), and such. It isn't enough to know the math, you need to know what the math means.

tuhlatte

5 days ago

The question at hand was whether LLMs could be trained to write good code. I took this to mean "good code within the domain of software engineering," not "good code within the universe of possible programs." If you interpreted it to mean the latter, so be it -- though I'm skeptical of the usefulness of this interpretation.

If the former, I still think that the vast majority of production software has metrics/unit tests that could be attached and subsequently hillclimbed via RL. Whether the resulting optimized programs would be considered "good" depends on your definition of "good." I suspect mine is more utilitarian than yours (as even after some thought I can't conceive of what a "terrible" proof might look like), but I am skeptical that your code review will prove to be a better measure of goodness than a broad suite of unit tests/verifiers/metrics -- which, to my original last point, are only getting more robust! And if these aren't enough, I suspect the addition of LLM-as-a-judge (potentially ensembles) checking for readability/maintainability/security vulnerabilities will eventually put code quality above that of what currently qualifies as "good" code.

Your examples of tasks that can't easily be optimized (image fidelity, song quality, etc.) seem out of scope to me -- can you point to categories of extant software that could not be hillclimbed via RL? Or is this just a fundamental disagreement about what it means for software to be "good"? At any rate, I think we can agree that the original claim that "The LLM has one job, to make code that looks plausible. That's it. There's no logic gone into writing that bit of code" is wrong in the context of RL.

godelski

5 days ago

  > I took this to mean "good code within the domain of software engineering," not "good code within the universe of possible programs.

We both mean the same thing. The reasonable one. The only one that even kinda makes sense: good enough code

  > vast majority of production software has metrics/unit tests that could be attached and subsequently hillclimbed via RL

Yes, hill climbed. But that's different than "towards good"

Here's the difference[0]. You'll find another name for Goodhart's Law in any intro ML course. Which is why it is so baffling that 1) this is contentious 2) it is the status quo in research now

Your metrics are only useful if you understand them

Your measures are only as good as your attention

And it is important to distinguish metrics from measures. They are different things. Both are proxies

  > Your examples of tasks that can't easily be optimized (image fidelity, song quality, etc.) seem out of scope to me

Maybe you're unfamiliar with diffusion models?[1]

They are examples where it is hopefully clearer that these things are hard to define. If you have good programming skills you should be able to make the connection back to what this has to do with my point. If not, I'm actually fairly confident GPT will be able to do so. There's more than enough in its training data to do that.

[0] https://en.wikipedia.org/wiki/Goodhart%27s_law

[1] https://stability.ai/

tuhlatte

5 days ago

Now I'm confused -- you're claiming you meant "good enough code" when your previous definition was such that even mathematical proofs could be "terrible"? That doesn't make sense to me. In software engineering, "good enough" has reasonably clear criteria: passes tests, performs adequately, follows conventions, etc. While these are imperfect proxies, they're sufficient for most real-world applications, and crucially -- measurable. And my claim is that they will be more than adequate to get LLMs to produce good code.

And again, diffusion models aren't relevant here. The original comment was about LLMs producing buggy code -- not RL's general limitations in other domains. Diffusion models' tensors aren't written by hand.

godelski

4 days ago

  > Now I'm confused ... that even mathematical proofs could be "terrible"? That doesn't make sense to me.

You know there's plenty of ways to prove things, right? Like there's not a single proof. Here's a few proofs for pi being irrational[0]. The list is not comprehensive.

Take that like you do with code. They all generate the same final output. They're all correct. But is one better than another? Yes, yes it is. But which one that is depends on context.

  > and crucially -- measurable

This is probably a point of contention. Measuring is far more difficult than people think. A lot of work goes into creating measurements and we get a nice ruler at the end. The problem isn't just that initial complexity, it is that every measure is a proxy. Even your meter stick doesn't measure a meter. What distinguishes the engineer from the hobbyist is the knowledge of alignment.

  How well does my measure align with what I intend to measure?

That's a very hard problem. How often do you ask yourself that? I'm betting not enough. Frankly, most things aren't measurable.

[0] https://proofwiki.org/wiki/Pi_is_Irrational#:~:text=Hence%20...

mindwok

5 days ago

This is just semantics. What's the difference between a "human interpretation of a good program" and a "good program" when we (humans) are the ones using it? If the model can write code that passes tests, and meets my requirements, then it's a good programmer. I would expect nothing more or less out of a human programmer.

otabdeveloper4

5 days ago

> What's the difference between a "human interpretation of a good program" and a "good program" when we (humans) are the ones using it?

Correctness.

> and meets my requirements

It can't do that. "My requirements" wasn't part of the training set.

mindwok

5 days ago

"Correctness" in what sense? It sounds like it's being expanded to an abstract academic definition here. For practical purposes, correct means whatever the person using it deems to be correct.

> It can't do that. "My requirements" wasn't part of the training set.

Neither are mine, the art of building these models is that they are generalisable enough that they can tackle tasks that aren't in their dataset. They have proven, at least for some classes of tasks, they can do exactly that.

godelski

5 days ago

  > to an abstract academic definition here

Besides the fact that your statement is self contradicting, there is actually a solid definition [0]. You should click the link on specification too. Or better yet, go talk to one of those guys that did their PhD in programming languages.

  > They have proven

Have they?

Or did you just assume?

Yeah, I know they got good scores on those benchmarks but did you look at the benchmarks? Look at the question and look what is required to pass it. Then take a moment and think. For the love of God, take a moment and think about how you can pass those tests. Don't just take a pass at face value and move on. If you do, well I got a bridge to sell you.

[0] https://en.wikipedia.org/wiki/Correctness_(computer_science)

mindwok

5 days ago

Sure,

> In theoretical computer science, an algorithm is correct with respect to a specification if it behaves as specified.

"As specified" here being the key phrase. This is defined however you want, and ranges from a person saying "yep, behaves as specified", to a formal proof. Modern language language models are trained under RL for both sides of this spectrum, from "Hey man looks good", to formal theorem proving. See https://arxiv.org/html/2502.08908v1.

So I'll return to my original point: LLMs are not just generating outputs that look plausible, they are generating outputs that satisfy (or at least attempt to satisfy) lots of different objectives across a wide range of requirements. They are explicitly trained to do this.

So while you argue over the semantics of "correctness", the rest of us will be building stuff with LLMs that is actually useful and fun.

godelski

5 days ago

You have to actually read more than the first line of a Wikipedia article to understand it

  > formal theorem proving

You're using Coq and Lean?

I'm actually not convinced you read the paper. It doesn't have anything to do with your argument. Someone using LLMs with formal verification systems is wildly different than LLMs being formal verification systems.

This really can't work if you don't read your own sources

otabdeveloper4

5 days ago

> they are generating outputs that satisfy (or at least attempt to satisfy) lots of different objectives across a wide range of requirements

No they aren't. You were lied to by the hype machine industry. Sorry.

The good news is that there's a lot of formerly intractable problems that can now be solved by generating plausible output. Programming is just not one of them.

mindwok

5 days ago

> No they aren't. You were lied to by the hype machine industry. Sorry.

Ok. My own empirical evidence is in favour of these things being useful, and useful enough to sell their output (partly), but I'll keep in mind that I'm being lied to.

otabdeveloper4

5 days ago

Quite a huge leap from "these things are useful" to "these things can code".

(And yes, this leap is the lie you're being sold. "LLMs are kinda useful" is not what led to the LLM trillion dollar hype bubble.)

mindwok

5 days ago

The thing I'm using them for is coding though...

godelski

5 days ago

Is your grandma qualified to determine what is good code?

  > If the model can write code that passes tests

You think tests make code good? Oh my sweet summer child. TDD has been tried many times and each time it failed worse than the last.

pydry

5 days ago

Good to know something i've been doing for 10 years consistently could never work.

godelski

5 days ago

It's okay, lots of people's code is always buggy. I know people that suck at coding and have been doing it for 50 years. It's not uncommon

I'm not saying don't make tests. But I am saying you're not omniscient. Until you are, your tests are going to be incomplete. They are helpful guides, but they should not drive development. If you really think you can test for every bug then I suggest you apply to be Secretary for health.

https://hackernoon.com/test-driven-development-is-fundamenta...

https://geometrian.com/projects/blog/test_driven_development...

Dylan16807

5 days ago

> It's okay, lots of people's code is always buggy. I know people that suck at coding and have been doing it for 50 years. It's not uncommon

Are you saying you're better than that? If you think you're next to perfect then I understand why you're so against the idea that an imperfect LLM could still generate pretty good code. But also you're wrong if you think you're next to perfect.

If you're not being super haughty, then I don't understand your complaints against LLMs. You seem to be arguing they're not useful because they make mistakes. But humans make mistakes while being useful. If the rate is below some line, isn't the output still good?

pydry

5 days ago

Ive worked with people who write tests afterwards on production code and it's pretty inevitable that they:

* End up missing tests for edge cases they built and forgot about. Those edge cases often have bugs.

* They forget and cover the same edge cases twice if theyre being thorough with test-after. This is a waste.

* They usually end up spending almost as much time manually testing in the end to verify the code change they just made worked whereas I would typically just deploy straight to prod.

It doesnt prevent all bugs it just prevents enough to make the teams around us who dont do it look bad by comparison even though they do manual checks too.

Ive heard loads of good reasons to not write tests at all, Ive yet to hear a good reason to not write one before if you are going to write one.

Both of your articles raise pretty typical straw men. One is "what if im not sure what the customer wants?" (thats fine but i hope you arent writing production code at this point) and the other is the peculiar but common notion that TDD can only be done with a low level unit test which is dangerous bullshit.

godelski

5 days ago

Sure, you work with some bad programmers. Don't we all?

The average driver thinks they're above average. The same is true about programmers.

I do disagree a bit with the post and think you should write tests while developing. Honestly, I don't think they'll disagree. I believe they're talking about a task rather than the whole program. Frankly, no program is ever finished so in that case you'd never write tests lol.

I believe this because they start off saying it wasn't much code.

But you are missing the point. From the first link

  > | when the tests all pass, you’re done
  > Every TDD advocate I have ever met has repeated this verbatim, with the same hollow-eyed conviction.

These aren't strawmen. These are questions you need to constantly be asking yourself. The only way to write good code is to doubt yourself. To second guess. Because that's what drives writing better tests.

I actually don't think you disagree. You seem to perfectly understand that tests (just like any other measure) are guides, not answers. That there's much more to this than passing tests.

But the second D in TDD is what's the problem. Tests shouldn't drive development, they are just part of development. The engineer writing tests at the end is inefficient, but the engineer that writes tests at the beginning is arrogant. To think you can figure it out before writing the code is laughable. Maybe some high level broad tests are feasible but that's only going to be a very small portion.

You can do hypothesis driven development, but people will call you a perfectionist and say you're going to slow. By HDD I mean you ask "what needs to happen, how would I know that is happening?" Which very well might involve creating tests. Any scientist is familiar with this but also familiar with its limits

pydry

5 days ago

TDD is not a panacea, it's an effective, pragmatic practice with several benefits and little to no downsides compared to test after.

Im not sure what you're saying, really but I dont think it disagrees with this central point in any specific way.

otabdeveloper4

5 days ago

"Good" is the context of LLMs means "plausible". Not "correct".

If you can't code then the distinction is lost on you, but in fact the "correct" part is why programmers get paid. If "plausible" were good enough then the profession of programmer wouldn't exist.

mindwok

5 days ago

Not necessarily. If the RL objective is passing tests then in the context of LLMs it means "correct", or at least "correct based on the tests".

otabdeveloper4

5 days ago

Unfortunately that doesn't solve the problem in any way. We don't have an Oracle machine for testing software.

If we did, we could autogenerate code even without an LLM.

cma

5 days ago

They are also trained with RL to write code to pass unit tests and Claude does have a big problem with trying to cheat the test or request pretty quickly after running into issues, making manual edit approval more important. It usually still tells what it is trying to do wrong so you can often find out from its summary before having to scan the diff.

meowface

5 days ago

This can happen, but in practice, given I'm reviewing every line anyway, it almost never bites me.

capnrefsmmat

6 days ago

The argument seems to be that for an expert programmer, who is capable of reading and understanding AI agent code output and merging it into a codebase, AI agents are great.

Question: If everyone uses AI to code, how does someone become an expert capable of carefully reading and understanding code and acting as an editor to an AI?

The expert skills needed to be an editor -- reading code, understanding its implications, knowing what approaches are likely to cause problems, recognizing patterns that can be refactored, knowing where likely problems lie and how to test them, holding a complex codebase in memory and knowing where to find things -- currently come from long experience writing code.

But a novice who outsources their thinking to an LLM or an agent (or both) will never develop those skills on their own. So where will the experts come from?

I think of this because of my job as a professor; many of the homework assignments we use to develop thinking skills are now obsolete because LLMs can do them, permitting the students to pass without thinking. Perhaps there is another way to develop the skills, but I don't know what it is, and in the mean time I'm not sure how novices will learn to become experts.

gwbas1c

6 days ago

> Question: If everyone uses AI to code, how does someone become an expert capable of carefully reading and understanding code and acting as an editor to an AI?

Well, if everyone uses a calculator, how do we learn math?

Basically, force students to do it by hand long enough that they understand the essentials. Introduce LLMs at a point similar to when you allow students to use a calculator.

palata

6 days ago

> Well, if everyone uses a calculator, how do we learn math?

Calculators have made most people a lot worse in arithmetic. Many people, for instance, don't even grasp what a "30%" discount is. I mean other than "it's a discount" and "it's a bigger discount than 20% and lower than 40%". I have seen examples where people don't grasp that 30% is roughly one third. It's just a discount, they trust it.

GPS navigation has made most people a lot worse at reading maps or generally knowing where they are. I have multiple examples where I would say something like "well we need to go west, it's late in the day so the sun will show us west" and people would just not believe me. Or where someone would follow their GPS on their smartphone around a building to come back 10m behind where they started, without even realising that the GPS was making them walk the long way around the building.

Not sure the calculator is a good example to say "tools don't make people worse with the core knowledge".

jen729w

5 days ago

GPS has also ruined our city-level spatial awareness.

Before, you had the map. So you were aware that Fitzroy was to the west of Collingwood and both were south of Clifton Hill and so on. I had dozens of these suburbs roughly mapped out in my mind.

Driving down an unfamiliar road, one could use signs to these suburbs as a guide. I might not know exactly where I was, but I had enough of an idea to point me in the right direction.

That skill has disappeared.

squirrel55583

4 days ago

>Before, you had the map. So you were aware that Fitzroy was to the west of Collingwood and both were south of Clifton Hill and so on. I had dozens of these suburbs roughly mapped out in my mind.

>Driving down an unfamiliar road, one could use signs to these suburbs as a guide. I might not know exactly where I was, but I had enough of an idea to point me in the right direction.

Reading those sentences feels like I am dreaming. The exploration... The possibilities... Serendipitously finding you way through and getting temporarily lost at night in a big friendly suburban area with trees and in summer...

quantified

5 days ago

Apparently 1/3 lb hamburgers didn't help A&W against McDonald's because too many people thought 1/3 is smaller than 1/4. So the Quarter Pounder remains supreme. Snopes: [https://www.snopes.com/news/2022/06/17/third-pound-burger-fr...]

rainsford

5 days ago

But how important is the core knowledge if it isn't necessary to achieve the outcomes people actually value? People only cared about map reading skills to the extent that it got them where they want to go. Once GPS became a thing, especially GPS on mobile phones, getting them where they want to go via map reading became irrelevant. Yes, there are corner cases where map reading or general direction finding skills are useful, but GPS does a vastly better and quicker job in the large majority of cases so our general way-finding experience has improved.

This is especially true because the general past alternative to using GPS to find some new unfamiliar place wasn't "read a map" it was "don't go there in favor of going some place you already knew" in a lot of cases. I remember the pre-GPS era, and my experience in finding new stuff is significantly better today than it was back then.

j_w

5 days ago

Using map reading skills as a proxy for this is a bit of a strawman. People who use GPS habitually have worse navigational and spatial awareness skills.

https://www.nature.com/articles/s41598-020-62877-0

If you habitually use a calculator for all arithmetic, could the result not be similar? What if you reach to an LLM for all your coding, general research, etc.? These tools may vastly speed up some workflows, but your brain is a muscle.

palata

5 days ago

I think you're missing the point, which is to say "those tools make us more productive, but less knowledgeable".

And you answer by saying "it's okay to be less knowledgeable (and hence depend on the tool), as long as you are more productive". Which is a different question.

But to me it's obviously not desirable: if AI allows people to completely lose all sense of critical thinking, I think it's extremely dangerous. Because whoever controls the AI controls those people. And right now, look at the techbros who control the AIs.

So the original question is: is it the case that AI reduces the skills of the people who use them? The calculator and the GPS are examples given to suggest that it doesn't sound unlikely.

ethbr1

5 days ago

At the end of the day, it's the average productivity across a population that matters.

So GPS makes people worse at orienteering -- on average, does it get everyone where they need to go, better / faster / easier?

Sometimes, the answer is admittedly no. Google + Facebook + TikTok certainly made us less informed when they cannibalized reporting (news media origination) without creating a replacement.

But on average, I'd say calculators did make the population more mathematically productive.

After all, lots of people sucked at math before them too.

palata

5 days ago

> After all, lots of people sucked at math before them too.

A calculator doesn't do maths, it does arithmetic. People sucked at maths, but I'm pretty sure they were better with arithmetic.

> At the end of the day, it's the average productivity across a population that matters.

You're pushing my example. My point is that AI may actually make the average developer worse. Sure, also more productive. So it will reinforce this trend that has been in the software industry for more than a decade: produce more but worse software.

Productivity explains why we do it. It doesn't mean it is desirable.

ethbr1

5 days ago

You're looking at this from a human-centric perspective.

I'm suggesting you consider it from an objective perspective.

It's easily possible for an organization to be more productive with worse developers because of the tools they have access to.

And no, that's not some slight of verbal hand in measuring "productive" -- they are able to ship more value, faster.

fzeroracer

5 days ago

> And no, that's not some slight of verbal hand in measuring "productive" -- they are able to ship more value, faster.

Ship more value faster is exactly a verbal slight of hand. That's the statement used by every bad product manager and finance asshole to advocate for shipping out broken code faster. It's more value because more code is more content, but without some form of quality guard rails you run into situations where everything breaks. I've been on teams just like that where suddenly everything collapses and people get mad.

ethbr1

5 days ago

Do you think compilers helped teams ship more value faster from worse developers? IDEs with autocomplete? Linters?

At the end of the day, coders are being paid money to produce something.

It's not art -- it's a machine that works and does a thing.

We can do that in ways that create a greater or lesser maintenance burden, but it's still functional.

LLM coding tools detractors are manufacturing reasons to avoid using another tool that helps them write code.

They need to get over the misconception of what the job is. As another comment previously quipped 'If you want to write artisanal, hand-tuned assembly that's beautiful, do that on your own time for a hobby project.'

fzeroracer

5 days ago

> Do you think compilers helped teams ship more value faster from worse developers? IDEs with autocomplete? Linters?

I'm tired of engaging with this false equivalence so I won't. Deterministic systems are not the same.

> It's not art -- it's a machine that works and does a thing.

That's right. But what you need to understand is that the machines we create can and do actively harm people. Leaking secure information, creating software that breaks systems and takes down critical infrastructure. We are engineers first and foremost and artists second. And that means designing systems to be robust and safe. If you can't understand that then you shouldn't be an engineer and should kindly fuck off.

ethbr1

5 days ago

As an engineer, appreciate that in the span of a paragraph you got yourself from talking about something technical... to telling someone to fuck off.

Even more humorously, because you seem to think I'm making an argument that isn't in anything I wrote. (LLM to/in prod)

palata

5 days ago

There is a big difference with compilers. With compilers, the developer still needs to write every single line of code. There is a clear an unambiguous contract between the source code and what gets executed (if it's ambiguous, it is a bug).

The thread here was talking about:

> Well, if everyone uses a calculator, how do we learn math?

The question being whether or not AI will make developers worse at understanding what their code is doing. You can say that "it's okay if a website fails every 100 times, the user will just refresh and we're still more profitable". But wouldn't you agree that such a website is objectively of worse quality? It's cheaper, for sure.

Said differently: would you fly in a plane for which the autopilot was vibe coded? If not, it tells you something about the quality of the code.

Do we always want better code? I don't know. What I see is that the trend is enshittification: more profit, worse products. I don't want that.

ethbr1

5 days ago

> [With compilers] There is a clear an unambiguous contract between the source code and what gets executed

Debatable in practice. You can't tell me you believe most developers understand what their compiler is doing, to a level of unambiguity.

Whether something gets unrolled, vectorized, or NOP-padded is mysterious. Hell, even memory management is mysterious in VM-based languages now.

And yes (to the inevitable follow-up) still deterministic, but those are things that developers used to have to know, now they don't, and the world keeps spinning.

> You can say that "it's okay if a website fails every 100 times, the user will just refresh and we're still more profitable". But wouldn't you agree that such a website is objectively of worse quality? It's cheaper, for sure.

I would say that's the reality we've been living in since ~2005. How often SaaS products have bugs? How frequently mobile apps ship a broken feature?

There are two components here: (1) value/utility & (2) cost/time.

There are many websites out there that can easily take a 1 in 100 error rate and still be useful.

But! If such a website, by dint of its shitty design, can be built with 1/100th of the resources (or 100x websites can be built with the same), then that might be a broader win.

Not every piece of code needs to fly in space or run nuclear reactors. (Some does! And it should always have much higher standards)

> Said differently: would you fly in a plane for which the autopilot was vibe coded? If not, it tells you something about the quality of the code.

I flew in a Boeing 737 MAX. To the above, that's a domain that should have called for higher software standards, but based on the incident rate I had no issue doing so.

> Do we always want better code? I don't know. What I see is that the trend is enshittification: more profit, worse products. I don't want that.

The ultimate tradeoff is between (expensive/less, better code) and (cheaper/more, worse code).

If everything takes a minimum amount of cost/effort, then some things will never be built. If that minimum cost/effort decreases, then they can be.

You and I are of like mind regarding enshittification and declining software/product standards, but I don't think standing in front of the technological advancement train is going to slow it.

If a thing can be built more cheaply, someone will do it. And then competitors will be forced to cheapen their product as well.

Imho, the better way to fight enshittification is creating business models that reward quality (and scale).

palata

5 days ago

> You and I are of like mind regarding enshittification and declining software/product standards, but I don't think standing in front of the technological advancement train is going to slow it.

Note that I'm well aware that I won't change anything. I'm really just saying that AI will help the trend of making most software become worse. It sucks, but that's how it is :-).

But yeah I think we agree on many points here.

ethbr1

5 days ago

The glass half-full would be that effective AI coding tools (read: more competent than a minimal cost human) may actually improve average software quality!

Suppose it depends on how quickly the generative effectiveness improves.

palata

5 days ago

> I'm suggesting you consider it from an objective perspective.

What is objective? That profitability is good? We're destroying our environment to the point where many of us will die from it for the sake of profitability. We're over-using limited natural resources for the sake of profitability. In my book that's not desirable at all.

Companies are profit-maximising machines. The path to more profitability tends to be enshittification: the company makes more money by making it worse for everybody. AI most definitely requires more resources and it seems like those resources will be used to do more, but of lower quality.

Surely that's profitable. But I don't think it is desirable.

ethbr1

5 days ago

Responded to you down below, but will say here that profit is a two-sided coin.

Decreasing the cost of production can lead to more profit or cheaper product prices.

I 100% agree that the trend of quality has been downwards in many things over the past 20 years or so.

monkeyelite

5 days ago

> it's the average productivity across a population that matters.

That’s not clear to me at all.

istjohn

5 days ago

I'm unconvinced that calculators have made most people a lot worse in arithmetic. There have always been people who are bad at math. It's likely there are fewer people who can quickly perform long division on paper, but it's also possible the average person is _more_ numerate because they can play around with a calculator and quickly build intuition.

palata

5 days ago

> I'm unconvinced that calculators have made most people a lot worse in arithmetic. There have always been people who are bad at math.

Arithmetic is a subset of maths.

MangoToupe

5 days ago

Arithmetic is also near-useless if you have access to a calculator. It's also a completely different skill thab reasoning about numbers, which is a very useful skill.

nitwit005

5 days ago

But, logically, you need to spend time thinking about numbers to be good reasoning about them, and the calculator is about reducing that time.

I feel there's a bit of a paradox, with many subjects, where we all know the basics are the absolute most important thing, but when we see the basics taught in the real world, it seems insultingly trivial.

MangoToupe

5 days ago

I understand what you're saying, but I legitimately am unconvinced learning long division is necessary to learn by hand to master division. If anything, perhaps we should be asking children to derive arithmetic from use of a calculator.

SoftTalker

5 days ago

I think it’s pretty hard to reason about numbers without having mastered arithmetic. Or at least beat your brain against it long enough that you understand the concepts even if you don’t have all the facts memorized.

MangoToupe

5 days ago

I disagree; i think the focus on arithmetic actually enables people saying they're "bad at math" when symbolic reasoning is a completely different (and arguably much easier) skill. You an easily learn algebra without knowing long division.

Hell, if I had to do long division today without a computer I'd have to re-derive it.

palata

5 days ago

I don't think it's so much about doing a long division. To me, it's more about having an intuition that 30/100 is roughly "one third", and that you can put three thirds in the full thing.

And I don't mean specifically those numbers, obviously. Same goes with 20/100, or understanding orders of magnitudes, etc.

Many people will solve a "maths problem" with their calculator, end up with a result that says that "the frog is moving at 21km/s" and not realise that it doesn't make any sense. "Well I applied the recipe, the calculator gave me this number, I assume this number is correct".

It's not only arithmetic of course, but it's part of it. Some kind of basic intuition about maths. Just look at what people were saying during Covid. I have heard so many people say completely wrong stuff because they just don't have a clue when they see a graph. And then they vote.

MangoToupe

5 days ago

Fair point, I concede that maybe i'm overly-optimistic at even basic intuition for basic figures and concepts.

SoftTalker

5 days ago

I agree you can learn algebra without knowing (or being good at) long division on paper, but you need to have a good conceptual understanding of what division is and I don't think a lot of people get that without the rote process of doing it over and over in elementary school.

recursive

5 days ago

I can do plenty of arithmetic much faster than I could type it on a calculator keypad. That's like saying hardware keyboards are near-useless if you have access to a touchscreen.

MangoToupe

5 days ago

I just don't really work much with arithmetic to begin with. Almost 100% of the numerical work I do is symbolic.

recursive

5 days ago

All well and good, but don't mistake a practice you don't use as being generally useless.

MangoToupe

5 days ago

Similarly, don't mistake a practice you do engage in as necessary!

palata

5 days ago

Would you be able to do your numerical work without understanding what an addition or a subtraction is?

I feel like arithmetic is part of the basics to build abstraction. If I say "y = 3x + a", somewhere I have to understand what "3 times x" means and what the "+" means, right?

Or are you saying that you can teach someone to do advanced maths without having a clue about arithmetic?

recursive

2 days ago

You can have a solid understanding of "+" without being able to quickly add three digit numbers.

creakingstairs

5 days ago

Sure there have always been people bad at math. But basic arithmetic is not really math. We used to drill it into kids but we no longer do so and I can usually see the difference between generations. For example, women in my mother’s generation were not prioritised for education but they often are pretty quick at arithmetic. But kids and young adults I come across pull out their phones for basic additions and divisions. And I find myself pulling out my phone more and more often.

I mean it’s not the end of the world and as you’ve said the raw number of people of numerate people are rising thanks to technology. But technology also seem to rob people of motivation to learn somewhat useful skills and even more so with LLMs.

techpineapple

5 days ago

"it's also possible the average person is _more_ numerate because they can play around with a calculator and quickly build intuition."

This is not actually possible.

MangoToupe

5 days ago

Why? I am honestly befuddled with this response. manually computing arithmetic is not necessary to reason about numbers.

devmor

5 days ago

> manually computing arithmetic is not necessary to reason about numbers

It absolutely is. If you can't add or subtract, what reasoning are you doing that is worthwhile?

MangoToupe

5 days ago

You can grasp the concept without executing the mechanics. I don't see how that's difficult to grasp.

palata

5 days ago

> I don't see how that's difficult to grasp.

Maybe that's because you actually can do arithmetic, to the point where it's difficult for you to see how it would be if you couldn't?

MangoToupe

5 days ago

I can't do arithmetic more than a couple digits, though, that's what the python console is for.

palata

4 days ago

For instance, you can certainly say that 381/7 is a positive number. And if I say "381/7 = 198", you can easily say that it is clearly wrong, e.g. because you immediately see that ~200 is roughly half of ~400, so it cannot be anywhere close to 1/7th.

I believe that this is an acquired skill that requires basic arithmetic. But if you need a calculator to realise that 381 is roughly twice as big as 198, then you can't do any of the reasoning above.

One may say "yeah but the point of the calculator is to not have to do the reasoning above", but I disagree. In life, we don't go around with a calculator trying to find links between stuff, like "there are 17 trees in this street, 30 cars, what happens if I do 17+30? Or 30-17? Or 30*17?". But if you have some intuition about numbers, you can often make more informed decisions ("I need to wait in one of those lines for the airport security check. This line is twice as long but is divided between three officers at the end, whereas this short line goes to only one officer. Which one is likely to be faster?").

MangoToupe

4 days ago

I see what you're saying, but I just don't care that much about numbers to draw any conclusions you did about the figure you presented. I just see a string of digits.

palata

4 days ago

> I just see a string of digits.

And you go to the shorter line, even if it's slower? :-)

devmor

5 days ago

Try standing in line at a grocery store and listening to people get upset because the amount is much higher than they thought it would be. You will hear statements like "But how is it $43? I didn't buy anything that costs more than $5"

People that failed to grasp arithmetic cannot reason about numbers to a useful degree.

MangoToupe

5 days ago

> People that failed to grasp arithmetic cannot reason about numbers to a useful degree.

I think you're extrapolating far too much from such a simple interaction, which doesn't imply anything about ability to reason about numbers, just their ability to compute addition. If you say "if a is larger than b, and b is larger than c, is a larger than c?", you're testing numerical reasoning ability.

devmor

4 days ago

Not compute addition - understand that addition is a function of numbers conceptually.

About 30% of US adults do not have the basic ability to conceptualize the relationship between whole numbers.

nsonha

5 days ago

You're confusing maths with accounting. 30% is an intuition/familiarity, not knowledge.

palata

5 days ago

I'm not confused. A calculator does arithmetic, not maths. The question was:

> Well, if everyone uses a calculator, how do we learn math?

Which doesn't make much sense, because a calculator doesn't do maths. So I answered the question that does make sense: if everyone uses a calculator, do we still learn arithmetic? And I believe we don't.

And then, if we suck at basic arithmetic, it makes it harder to be good at maths.

fizx

5 days ago

Writing has made people worse at memorization. This argument has been around since Plato.

monkeyelite

5 days ago

That’s also true though. In oral societies they memorize a lot more.

presentation

5 days ago

But somehow I was born in the age of GPS and yet I ended up with a strong mental map and navigation skills.

I suspect there will be plenty of people who grow up in the age of LLMs and maybe by reading so much generated code, or just coding things themselves for practice, will not have a hard time learning solid coding skills. It may be easy to generate slop, but it’s also easy to access high quality guidance.

username223

5 days ago

If calculators were unreliable... Well, we'd be screwed if everyone blindly trusted them and never learned math.

They'd also be a whole lot less useful. Calculators are great because they always do exactly what you tell them. It's the same with compilers, almost: imagine if your C compiler did the right thing 99.9% of the time, but would make inexplicable errors 0.1% of the time, even on code that had previously worked correctly. And then CPython worked 99.9% of the time, except it was compiled by a C compiler working 99.9% of the time, ...

But bringing it back on-topic, in a world where software is AI-generated, and tests are AI-generated (because they're repetitive, and QA is low-status), and user complaints are all fielded by chat-bots (because that's cheaper than outsourcing), I don't see how anyone develops any expertise, or how things keep working.

gwbas1c

4 days ago

Early calculators were unreliable. Assume that AI based coding will improve.

sarchertech

4 days ago

What early calculators were unreliable?

I’m unaware of any, but even if there is an example it was before they were in widespread use.

mmasu

6 days ago

While I agree with your suggestion, the comparison does not hold: calculators do not tell you which numbers to input and compute. With an LLM you can just ask vaguely, and get an often passable result

gwbas1c

4 days ago

Then figure out how to structure the assignment to make students show their work. If a student doesn't understand the concept, it will show in how they prompt AI.

For example, you could require that students submit all logs of AI conversations, and show all changes they made to the code produced.

IE, yesterday I asked ChatGPT how to add a copy to clipboard button in MudBlazor. It told me the button didn't exist, and then wrote the component for me. That saved me a bunch of research; but I needed to refactor the code for various reasons.

So, if this was for an assignment, I could turn in both my log from ChatGPT, and then show the changes I made to the code ChatGPT provided.

dimal

6 days ago

> a novice who outsources their thinking to an LLM or an agent (or both) will never develop those skills on their own. So where will the experts come from?

Well, if you’re a novice, don’t do that. I learn things from LLMs all the time. I get them to solve a problem that I’m pretty sure can be solved using some API that I’m only vaguely aware of, and when they solve it, I read the code so I can understand it. Then, almost always, I pick it apart and refactor it.

Hell, just yesterday I was curious about how signals work under the hood, so I had an LLM give me a simple example, then we picked it apart. These things can be amazing tutors if you’re curious. I’m insatiably curious, so I’m learning a lot.

Junior engineers should not vibe code. They should use LLMs as pair programmers to learn. If they don’t, that’s on them. Is it a dicey situation? Yeah. But there’s no turning back the clock. This is the world we have. They still have a path if they want it and have curiosity.

capnrefsmmat

6 days ago

> Well, if you’re a novice, don’t do that.

I agree, and it sounds like you're getting great results, but they're all going to do it. Ask anyone who grades their homework.

Heck, it's even common among expert users. Here's a study that interviewed scientists who use LLMs to assist with tasks in their research: https://doi.org/10.1145/3706598.3713668

Only a few interviewees said they read the code through to verify it does what they intend. The most common strategy was to just run the code and see if it appears to do the right thing, then declare victory. Scientific codebases rarely have unit tests, so this was purely a visual inspection of output, not any kind of verification.

dartharva

5 days ago

> Junior engineers should not vibe code. They should use LLMs as pair programmers to learn. If they don’t, that’s on them. Is it a dicey situation? Yeah. But there’s no turning back the clock. This is the world we have. They still have a path if they want it and have curiosity.

Except it's impossible to follow your curiosity when everything in the world is pushing against it (unless you are already financially independent and only programming for fun). Junior developers compete in one of the most brutal labor markets in the world, and their deliverables are more about getting things done on time than doing things better. What they "should" do goes out the window once you step out of privilege and look at the real choices.

curt15

5 days ago

You sound like an active learner who could become a top programmer even without LLMs. Most students will take the path of least resistance.

simonw

5 days ago

There is absolutely a thing where self-motivated autodidacts can benefit massively more from these new tools than people who prefer structured education.

dartharva

5 days ago

Paradoxically, those self-motivated autodidacts will have to be without the stress and pressure of delivering things on time, and thus get largely limited to recreational programmers who don't have as much skin in the game in the first place.

dimal

5 days ago

Trust me, I am under enormous stress and pressure right now, more than at any other time in my life. I’m not someone sitting on a mountaintop, free of all cares. I’m someone trapped in a box that’s sinking under the waves, desperately trying to find a way to escape.

I approach problems with curiosity because I know that this is the only way I’ll find a way to survive and thrive again.

CobrastanJorji

5 days ago

This reminds me of Isaac Asimov's "Profession" short story. Most people receive their ability (and their matching assigned profession, thus the name) from a computer. They then are able to do the necessary tasks for their job, but they can't advance the art in any way. A few people aren't compatible with this technology, and they instead learn to do things themselves, which is fortunate because it's the only way to advance the arts.

mgraczyk

6 days ago

Deliberate practice, which may take a form different from productive work.

I believe it's important for students to learn how to write data structures at some point. Red black trees, various heaps, etc. Students should write and understand these, even though almost nobody will ever implement one on the job.

Analogously electrical engineers learn how to use conservation laws and Ohm's law to compute various circuit properties. Professionals use simulation software for this most of the time, but learning the inner workings is important for students.

The same pattern is true of LLMs. Students should learn how to write code, but soon the code will write itself and professionals will be prompting models instead. In 5-10 years none of this will matter though because the models will do nearly everything.

capnrefsmmat

6 days ago

I agree with all of this. But it's already very difficult to do even in a college setting -- to force students to get deliberate practice, without outsourcing their thinking to an LLM, you need various draconian measures.

And for many professions, true expertise only comes after years on the job, building on the foundation created by the college degree. If students graduate and immediately start using LLMs for everything, I don't know how they will progress from novice graduate to expert, unless they have the self-discipline to keep getting deliberate practice. (And that will be hard when everyone's telling them they're an idiot for not just using the LLM for everything)

tsimionescu

6 days ago

You're talking about students, but the question was about seniors. You don't go to school to become a senior dev, you code in real-world settings, with real business pressures, for a decade or two to become a senior. The question is how are decent students supposed to grow into seniors who can independently evaluate AI-produced code if they are forced to use the magic box and accept its results before being able to understand them?

mgraczyk

6 days ago

I was talking about students because I was replying to a comment from a professor talking about his students

NeutralCrane

5 days ago

> Question: If everyone uses AI to code, how does someone become an expert capable of carefully reading and understanding code and acting as an editor to an AI?

LLMs are very much like pair programmers in my experience. For the junior engineer, they are excellent resources for learning, the way a senior engineer might be. Not only can they code what the junior can’t, they can explain questions the junior has about the code and why it’s doing what it’s doing.

For senior devs, it is a competent pair programmers, acting as an excellent resource for bouncing ideas off of, rubber ducking, writing boilerplate, and conducting code reviews.

For expert devs, it is a junior/senior dev you can offload all the trivial tasks to so you can focus on the 10% of the project that is difficult enough to require your expertise. Like a junior dev, you will need to verify what it puts together, but it’s still a huge amount of time saved.

For junior devs specifically, if they are not curious and have no interest in actually learning, they will just stop at the generated code and call it a day. That’s not an issue with the tool, it’s an issue with the dev. For competent individuals with a desire to learn and grow, LLMs represent one of the single best resources to do so. In that sense, I think that junior devs are at a greater advantage than ever before.

xg15

5 days ago

> That’s not an issue with the tool, it’s an issue with the dev.

Hard disagreeing here. It's a difference to work on a task because you feel it brings you tangible progress or because it's an artificial exercise that you could really do with one sentence to Claude if it weren't for the constraints of the learning environment. This feeling is actually demotivating for learning.

r3trohack3r

6 days ago

I don’t know about you, but I use LLMs as gateways to knowledge. I can set a deep research agent free on the internet with context about my current experience level, preferred learning format (books), what I’m trying to ramp up on, etc. A little while later, I have a collection of the definitive books for ramping up in a space. I then sit down and work through the book doing active recall and practice as I go. And I have the LLM there for Q&A while I work through concepts and “test the boundaries” of my mental models.

I’ve become faster at the novice -> experienced arc with LLMs, even in domains that I have absolutely no prior experience with.

But yeah, the people who just use LLMs for “magic oracle please tell me what do” are absolutely cooked. You can lead a horse to water, but you can’t make it drink.

andersa

6 days ago

If no one really becomes an expert anymore, that seems like great news for the people who are already experts. Perhaps people actively desire this.

stackskipton

6 days ago

Problem is, at some point those experts retire or change their focus and you end up with COBOL problem.

Except instead of just one language on enterprise systems no one wants to learn because there is no money in them, it's everything.

andersa

6 days ago

That seems like even better news for the people about to be paid large sums to fix all that stuff because no one else knows how any of it works.

ofjcihen

6 days ago

It’s a great point and one I’ve wondered myself.

Arguments are made consistently about how this can replace interns or juniors directly. Others say LLMs can help them learn to code.

Maybe, but not on your codebase or product and not with a seniors knowledge of pitfalls.

I wonder if this will be programmings iPhone moment where we start seeing a lack of deep knowledge needed to troubleshoot. I can tell you that we’re already seeing a glut of security issues being explained by devs as “I asked copilot if it was secure and it said it was fine so I committed it”.

tpmoney

5 days ago

> I can tell you that we’re already seeing a glut of security issues being explained by devs as “I asked copilot if it was secure and it said it was fine so I committed it”.

And as with Google and Stack Overflow before, the Sr Devs will smack the wrists of the Jr's that commit untested and unverified code, or said Jr's will learn not to do those things when they're woken up at 2 AM for an outage.

remich

5 days ago

That's assuming the business still employs those Sr Devs so they can do the wrist smacking.

To be clear, I think any business that dumps experienced devs in favor of cheaper vibe-coding mids and juniors would be making a foolish mistake, but something being foolish has rarely stopped business types from trying.

xg15

5 days ago

The way the responses to this subthread show the classical "the problem doesn't exist - ok, it does exist but it's not a big deal - ok, it is a big deal but we should just adapt to it" progression makes me wonder if we found one of the few actually genuine objections to LLM coding.

1970-01-01

6 days ago

Nail on head. Before, innovations in code were extensions of a human's capabilities. The LLM-driven generation could diminish the very essence of writing meaningful code, to the point where they will live in the opposite of a golden era. The dead internet theory may yet prevail.

killerstorm

6 days ago

I think a large fraction of my programming skills come from looking through open source code bases. E.g. I'd download some code and spend some time navigating through files looking for something specific, e.g. "how is X implemented?", "what do I need to change to add Y?".

I think it works a bit like pre-training: to find what you want quickly you need to have a model of coding process, i.e. why certain files were put into certain directories, etc.

I don't think this process is incompatible with LLM use...

jedberg

6 days ago

If I were a professor, I would make my homework start the same -- here is a problem to solve.

But instead of asking for just working code, I would create a small wrapper for a popular AI. I would insist that the student use my wrapper to create the code. They must instruct the AI how to fix any non-working code until it works. Then they have to tell my wrapper to submit the code to my annotator. Then they have to annotate every line of code as to why it is there and what it is doing.

Why my wrapper? So that you can prevent them from asking it to generate the comments, and so that you know that they had to formulate the prompts themselves.

They will still be forced to understand the code.

Then double the number of problems, because with the AI they should be 2x as productive. :)

capnrefsmmat

5 days ago

For introductory problems, the kind we use to get students to understand a concept for the first time, the AI would likely (nearly) nail it on the first try. They wouldn't have to fix any non-working code. And annotating the code likely doesn't serve the same pedagogical purpose as writing it yourself.

Students emerge from lectures with a bunch of vague, partly contradictory, partly incorrect ideas in their head. They generally aren't aware of this and think the lecture "made sense." Then they start the homework and find they must translate those vague ideas into extremely precise code so the computer can do it -- forcing them to realize they do not understand, and forcing them to make the vague understanding concrete.

If they ask an AI to write the code for them, they don't do that. Annotating has some value, but it does not give them the experience of seeing their vague understanding run headlong into reality.

I'd expect the result to be more like what happens when you show demonstrations to students in physics classes. The demonstration is supposed to illustrate some physics concept, but studies measuring whether that improves student understanding have found no effect: https://doi.org/10.1119/1.1707018

What works is asking students to make a prediction of the demonstration's results first, then show them. Then they realize whether their understanding is right or wrong, and can ask questions to correct it.

Post-hoc rationalizing an LLM's code is like post-hoc rationalizing a physics demo. It does not test the students' internal understanding in the same way as writing the code, or predicting the results of a demo.

palata

6 days ago

> They will still be forced to understand the code.

But understanding is just one part of the learning process, isn't it? I assume everybody has had this feeling: the professor explains maths on the blackboard, and the student follows. The students "understands" all the steps: they make sense, they don't feel like asking a question right now. Then the professor gives them an exercise slightly different and asks to do the same, and the students are completely lost.

Learning is a loop: you need to accept it, get it in your memory (learn stuff by heart, be it just the vocabulary to express the concepts), understand it, then try to do it yourself. Realise that you missed many things in the process, and start at the beginning: learn new things by heart, understand more, try it again.

jedberg

6 days ago

That loop is still there. They have to get the AI to write the right code.

And beyond that, do they really need to understand how it works? I never learned how to calculate logarithms by hand, but I know what they are for and I know when to punch the button on the calculator.

I'll never be a top tier mathematician, but that's not my goal. My goal is to calculate things that require logs.

If they can get the AI to make working code and explain why it works, do they need to know more than that, unless they want to be top in their field?

palata

5 days ago

> If they can get the AI to make working code and explain why it works, do they need to know more than that, unless they want to be top in their field?

Making working code is the easy part. Making maintainable code is a completely different story.

And again, being able to explain why something works requires superficial knowledge. This is precisely why bugs pass through code reviews: it's hard to spot a bug by reading code that looks like it should work.

sanderjd

6 days ago

Yep, this is the thing I worry about as well.

I find these tools incredibly useful. But I constantly edit their output and frequently ask for changes to other peoples' code during review, some of which is AI generated.

But all of that editing and reviewing is informed by decades of writing code without these tools, and I don't know how I would have gotten the reps in without all that experience.

So I find myself bullish on this for myself and the experienced people I work with, but worried about training the next generation.

jeremyjh

5 days ago

Yes I feel the same way. But I worry about my kids. My 15-year old son wanted to go into software engineering and work for a game studio. I think I'll advocate civil engineering, but for someone who will still be working 50 years from now its really hard to know what will be a good field right now.

sanderjd

5 days ago

Yeah but in fairness, it's always true that it's hard to know what a good field will be in half a century.

rglover

6 days ago

> So where will the experts come from?

They won't, save for a relative minority of those who enjoy doing things the hard way or those who see an emerging market they can capitalize on (slop scrubbers).

I wrote this post [1] last month to share my concerns about this exact problem. It's not that using AI is bad necessarily (I do every day), but it disincentivizes real learning and competency. And once using AI is normalized to the point where true learning (not just outcome seeking) becomes optional, all hell will break loose.

> Perhaps there is another way to develop the skills

Like sticking a fork in a light socket, the only way to truly learn is to try it and see what happens.

[1] https://ryanglover.net/blog/chauffeur-knowledge-and-the-impe...

pchristensen

5 days ago

LLMs are also great to ask questions about existing code. It's like being able to converse with StackOverflow.

bawolff

6 days ago

I dont know if im convinced by this. Like if we were talking about novels, you don't have to be a writer to check grammar and analyze plot structure in a passable way. It is possible to learn by reading instead of doing.

capnrefsmmat

6 days ago

Sure, you could learn about grammar, plot structure, narrative style, etc. and become a reasonable novel critic. But imagine a novice who wants to learn to do this and has access to LLMs to answer any question about plots and style that they want. What should they do to become a good LLM-assisted author?

The answer to that question is very different from how to become an author before LLMs, and I'm not actually sure what the answer is. It's not "write lots of stories and get feedback", the conventional approach, but something new. And I doubt it's "have an LLM generate lots of stories for you", since you need more than that to develop the skill of understanding plot structures and making improvements.

So the point remains that there is a step of learning that we no longer know how to do.

AstroBen

5 days ago

I've had a lot of success using LLMs to deepen my understanding of topics. Give them an argument, and have them give the best points against it. Consider them, iterate. Argue against it and let it counter. It's a really good rubber duck

> The expert skills... currently come from long experience writing code

Do they? Is it the writing that's important? Or is it the thinking that goes along with it? What's stopping someone from going through LLM output, going back and forth on design decisions with the LLM, and ultimately making the final choice of how the tool should mold the codebase after seeing the options

I mean of course this requires some proactive effort on your part.. but it always did

The key point I think though is to not outsource your thinking. You can't blindly trust the output. It's a modern search engine

_tom_

5 days ago

I think it's the writing.

I learned long ago that I could read a book, study it, think about it. And I still would really master the material until I built with it.

kulahan

6 days ago

> If everyone uses AI to code, how does someone become an expert

The same way they do now that most code is being copied/integrated from StackOverflow.

groos

6 days ago

I had this conversation with a friend:

HIM: AI is going to take all entry level jobs soon. ME: So the next level one up will become entry level? HIM: Yes. ME: Inductively, this can continue up to the CEO. What about the CEO? HIM. Wait...

kulahan

5 days ago

I simply don’t believe all the jobs will go away; it feels much more like the field will just be significantly pared back. There will be more opportunities for juniors eventually if it turns out to be too high of a barrier to entry and elder programmers start to retire.

ldjkfkdsjnv

6 days ago

This is such a non issue and so far down the list of questions. Weve invented AI that can code, and you're asking about career progression? Thats the the top thing to talk about? Weve given life to essentially an alien life form

sho_hn

5 days ago

"What is this going to do to humans?" is probably the #1 question that should be on the mind of every engineer, every day. Being toolmakers for civilization is the entire point of our profession.

hiAndrewQuinn

6 days ago

I'll take the opposite view of most people. Expertise is a bad thing. We should embrace technological changes that render expertise economically irrelevant with open arms.

Take a domain like US taxation. You can certainly become an expert in that, and many people do. Is it a good thing that US taxes are so complicated that we have a market demand for thousands of such experts? Most people would say no.

Don't get my wronf, I've been coding for more years of being alive than I haven't by this point, I love the craft. I still think younger me would have far preferred a world where he could have just had GPT do it all for him so he didn't need to spend his lunch hours poring over the finer points of e.g. Python iterators.

TheOtherHobbes

6 days ago

By the same logic we should allow anyone with an LLM to design ships, bridges, and airliners.

Clearly, it would be very unwise to buy a bridge designed by an LLM.

It's part of a more general problem - the engineering expectations for software development are much lower than for other professions. If your AAA game crashes, people get annoyed but no one dies. If your air traffic control system fails, you - and a large number of other poeple - are going to have a bad day.

The industry that has a kind of glib unseriousness about engineering quality - not theoretical quality, based on rules of thumb like DRY or faddy practices, but measurable reliability metrics.

The concept of reliability metrics doesn't even figure in the LLM conversation.

That's a very bizarre place to be.

jacobgkau

6 days ago

> We should embrace technological changes that render expertise economically irrelevant with open arms.

To use your example, is using AI to file your taxes actually "rendering [tax] expertise economically irrelevant?" Or is it just papering over the over-complicated tax system?

From the perspective of someone with access to the AI tool, you've somewhat eased the burden. But you haven't actually solved the underlying problem (with the actual solution obviously being a simpler tax code). You have, on the other hand, added an extra dependency on top of an already over-complicated system.

layer8

6 days ago

In addition, a substantial portion of the complexity in software is essential complexity, not just accidental complexity that could be done away with.

Xss3

5 days ago

This. And most of the time the code isn't that complex either. The complexity of a software product often isn't in the code, it's in the solution as a whole, the why's of each decision, not the how.

However whenever I've faced actual hard tasks, things that require going off the beaten path the AI trains on, I've found it severely lacking, no matter how much or little context I give it, no matter how many new chats I make, it just won't veer into truly new territory.

hiAndrewQuinn

5 days ago

I never said anything about using AI to do your taxes.

I was drawing an analogy. We would probably be better off with a tax system that wasn't so complicated it creates its own specialized workforce. Similarly we would be better off with programming tools that make the task so simple that professional computer programmers feel like a 20th century anachronism. It might not be what we personally want as people who work in the field, but it's for the best.

jacobgkau

4 days ago

> I never said anything about using AI to do your taxes. I was drawing an analogy.

Yeah, I was using your analogy.

> It might not be what we personally want as people who work in the field, but it's for the best.

You're inventing a narrative and borderline making a strawman argument. I said nothing about what people who work in the field "personally want." I'm talking about complexity.

> Similarly we would be better off with programming tools that make the task so simple that professional computer programmers feel like a 20th century anachronism.

My point is that if the "tools that make the task simple" don't actually simplify what's happening in the background, but rather paper over it with additional complexity, then no, we would not "be better off" with that situation. An individual with access to an AI tool might feel that he's better off; anyone without access to those tools (now or in the future) would be screwed, and the underlying complexity may still create other (possibly unforeseen) problems as that ecosystem grows.

open592

6 days ago

The question then becomes whether or not it's possible (or will be possible) to effectively use these LLMs for coding without already being an expert. Right now, building anything remotely complicated with an LLM, without scouring over every line of code generated, is not possible.

triceratops

6 days ago

Counter-counter point. The existence of tools like this can allow the tax code to become even more complex.

hiAndrewQuinn

5 days ago

Nowhere do I suggest using AI to do your taxes. My point was, if you think it's bad taxes are complicated enough that many people need to hire a professional to do it, you should also think it's bad programming is complicated enough that many people need to hire a professional to do it.

layer8

6 days ago

I mean, we already have vibe tariffs, so vibe taxation isn’t far off. ;)

hooverd

6 days ago

Don't think of it from someone who had to learn. Think of it from someone who has never had the experience the friction of learning at all.

superb_dev

6 days ago

But that is incompatible with the fact that you need be an expert to wield this tool effectively.

rafavento

6 days ago

There's something that seems to be missing in all these posts and that aligns with my personal experience trying to use AI coding assistants.

I think in code.

To me, having to translate the into natural language for the LLM to translate it back into code makes very little sense.

Am I alone in this camp? What am I missing?

simonw

6 days ago

If you think in code, try prompting them in code.

I quite often prompt with code in a different language, or pseudo-code describing roughly what I am trying to achieve, or a Python function signature without the function body.

Or I will paste in a bunch of code I have already written with a comment somewhere that says "TODO: retrieve the information from the GitHub API" and have the model finish it for me.

hiq

6 days ago

> a Python function signature without the function body.

This, and for multiple functions that end up composing well together as per their signatures. Maybe there's one public function I want to document well, so I write the docstring myself, and it's the result for 3-4 other internal functions which I'd let the LLM implement.

The nice part is that even if the LLM fails, all that is not lost, as opposed to some weird spec I'd feed an LLM but that's too verbose for a human reader, or a series of prompts.

Seb-C

5 days ago

I'm also in this camp, and that's why it does not work for me.

Natural language is just a terrible interface and fundamentally not an appropriate one to communicate with a computer.

I wonder if I'm in the minority here because I'm neurodivergent.

lcrmorin

a day ago

No. Thanks for pointing this out. It took quite some time to understand and express this to my management. Also I use a very small set of libraries, with function names / parameters that are quite explicit. And I don't need to specify some style to bypass hidden prompts, and bad coding practices.

jansan

5 days ago

I am pretty sure we will see programming languages that are custom made for AI popping up soon. I cannot predict how it will look like, but it may be a mix between an extremely well documented language with lots of safeguards (like Kotlin or Java) combined with natural language like instructions.

9rx

5 days ago

So, basically, you envision that we will eventually add a type system to English?

ACCount36

6 days ago

I know what you mean. The thing is: if you already have the solution put together in your mind, it might be faster to just implement it by hand.

But if you don't have the shape of a solution? Might be faster to have an AI find it. And then either accept AI's solution as is, or work off it.

nostrademons

6 days ago

Curious how he reconciles this:

> If you build something with an LLM that people will depend on, read the code. In fact, you’ll probably do more than that. You’ll spend 5-10 minutes knocking it back into your own style.

with Joel Spolsky's fundamental maxim:

> It’s harder to read code than to write it.

https://www.joelonsoftware.com/2000/04/06/things-you-should-...

pyb

5 days ago

I'm sure no one reads LLM-produced code very carefully, especially if it "works". At best they skim.

jansan

5 days ago

The problem is that a lot of code works in general, but fails in edge cases. I would hate to be the guy who's job is only to find out why verbose AI generated code fails in one particular condition.

swalsh

5 days ago

I read LLM generated code like I review a PR. I skim for anything that stands out as a common pitfall, and dig into the details of area I expect issues.

furyofantares

5 days ago

For most things I'm not willing to accept faster code at the expense of being an expert in the code.

So I am still trying to find the right amount of reading, editing, and reworking that gets the job done faster, where "the job" includes me being an expert in the produced code, not just the production of code.

There are periods of skimming but I'm doing a lot more than skimming.

lawn

5 days ago

This is my biggest problem with LLM generated code.

But I'm sure the author will simply dismiss it as a skill issue too.

skwirl

5 days ago

The major difference is that in the type of reading Joel Splosky is talking about, you are coming in not knowing the code's intent. It was written by one or more other people at some point in the past, likely with many iterative changes over a period of time. Figuring out the intent in this case is 90%+ of the work. With LLM generated code, you know the intent. You just told the assistant exactly what your intent was. It's much, much easier to read code that you already know the intent of.

layer8

6 days ago

You can invest more time in the writing to make code easier to read. It’s just that it typically would take much more time. It would be conceivable for LLMs to eventually reach a state where they can produce such easily readable and comprehensible code.

simonw

5 days ago

I can reconcile that. Joel is right. So is Thomas. If you're a professional programmer you should be able to do the harder thing.

Thomas was a bit more spicy about that:

> but you have no idea what the code is - Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you?

danShumway

4 days ago

You've said this a couple of times, but I don't understand what you're trying to say.

Yes, I can do frustrating things, I know how to review and vet code. I also know how to write boilerplate code. I also know how to research new tasks in areas where I have no familiarity that are poorly documented. I know how to do a lot of complicated, difficult things - all of which are part of being a professional developer.

The question is whether I want to use a tool that makes most of my job the difficult part.

The only way this makes sense is if you are somehow reaching a point where "doing the harder thing" is not the harder thing for you anymore - where reviewing code is easier for you than writing it. And maybe your argument is that you can get to that point with practice, or that LLM code is generally easier to review than other types of code, or that the problems it's tackling are so difficult that in those cases writing the code is harder than reading it.

But it's not that they're both true. "You should be able to do the harder thing" is not really an answer to "why are you selling me a tool that replaces an easy thing with a hard thing?"

There are many difficult things that I can do as a professional software developer. I can mentor junior developers. I can do detailed product design work with stakeholders and translate technical limitations into language that they understand. I can negotiate software deadlines with product owners. I can write interfaces for undocumented software and deal with undocumented bugs in 3rd-party code. I can step through minified code in production settings to debug problems. These are all difficult things that, as a professional developer, I am capable and willing to do, and often need to do. And yes, of course, I can review pull requests. I am not, however, generally in the business of adopting tools that force me to do that stuff more often than is necessary to get good results. I don't adopt tools that make my life harder, and I honestly think that's a big part of being a professional.

To be very blunt about it: "Buck up, you should be able to handle this" is not a sales pitch. I can also write with my non-dominant hand, but I'm not going to start taking notes that way. There's no intrinsic prize for making your life harder, the only way what you're saying makes sense is if it's not harder for you to read code than to write it.

simonw

4 days ago

I've kind of lost track of where we disagree here, to be honest.

Maybe we need to drop "easier" and "harder" and talk about speed.

I can write software faster with LLMs, without sacrificing quality - in fact I can get higher quality software because doing things "better" doesn't mean they take longer.

I derive enjoyment from building good stuff. If I can do it faster I can build more of it, which increases my enjoyment.

I wrote about this a couple of years ago: "AI-enhanced development makes me more ambitious with my projects": https://simonwillison.net/2023/Mar/27/ai-enhanced-developmen...

That's still true today, only more so because the models are significantly more competent than they were in March 2023!

danShumway

4 days ago

I think it's great if reviewing code for you is faster than writing it.

I don't think reviewing code well is something most developers can do faster than they can write it. I think that's what Joel is getting at when he says that understanding code is harder than writing it. Harder in the sense of, it takes more effort and takes longer and you are more likely to make errors.

And that might not be true for you. But it is true for a huge number of professional developers.

And it is certainly not the case that understanding and reviewing code is both:

- more time consuming and more difficult than writing it and

- that it's faster to move your entire development strategy to one where you review existing code.

Those are incompatible claims. Pick one.

----

I wouldn't normally quibble about something like this, but it does kind of rub me the wrong way when I hear AI developers talk down about this (and I'm sure it's not your intention to talk down). In your post, you write:

> Figuring out how to patch fetch() like that is non-trivial—where I’m using the software engineer’s definition of “non-trivial” as meaning “I don’t know how to do that off the top of my head”. I’d have to do some thinking and research! I’m already tempted to drop this idea and work on something else.

If I responded to that by writing, "well, doing quick thinking and research is part of the job of being a professional developer and is a learned skill that you could get better at, so what's your problem" - I think you would very rightly say that's not a reasonable response.

So I think that "well, you're a professional, you should be faster at reviewing code" is similarly dismissive to a real conflict inherent in these tools that you are ignoring, and is the kind of dismissive response that I don't normally see from you. Especially phrasing it as, "they're both true".. I don't understand what that even means.

They're not both true, you're telling me right now that it's not both true - you are telling me that it is faster for you to digest code than it is for you to write it. So what is this "both are true" bullcrap?

amonks

4 days ago

I'm not Simon, but I think reviewing code is both faster than writing code and more difficult than writing code.

Lots of difficult things don't take very much time: shooting a bullseye, lifting something heavy, winning a round of geoguessr, playing the guitar solo from Peg. We don't call these things difficult because they take a lot of time in the moment, but because they take a lot of time to master.

I think reading code is like that too. When I think about the best code readers/reviewers I've worked with, they are (1) also among the best and fastest code writers I know, and (2) still much faster at reviewing code than they are at writing it.

bccdee

5 days ago

To quote an excellent article from last week:

> The AI has suggested a solution, but the added code is arguably useless or wrong. There is a huge decision space to consider, but the AI tool has picked one set of decisions, without any rationale for this decision.

> [...]

> Programming is about lots of decisions, large and small. Architecture decisions. Data validation decisions. Button color decisions.

> Some decisions are inconsequential and can be safely outsourced. There is indeed a ton of boilerplate involved in software development, and writing boilerplate-heavy code involves near zero decisions.

> But other decisions do matter.

(from https://lukasatkinson.de/2025/net-negative-cursor/)

Proponents of AI coding often talk about boilerplate as if that's what we spend most of our time on, but boilerplate is a cinch. You copy/paste, change a few fields, and maybe run a macro on it. Or you abstract it away entirely. As for the "agent" thing, typing git fetch, git commit, git rebase takes up even less of my time than boilerplate.

Most of what we write is not highly creative, but it is load-bearing, and it's full of choices. Most of our time is spent making those choices, not typing out the words. The problem isn't hallucination, it's the plain bad code that I'm going to have to rewrite. Why not just write it right myself the first time? People say "it's like a junior developer," but do they have any idea how much time I've spent trying to coax junior developers into doing things the right way rather than just doing them myself? I don't want to waste time mentoring my tools.

tptacek

5 days ago

No, what's happening here is that you're using a different definition of "boilerplate" than the adopters are using. To you, "boilerplate" is literally a chunk of code you copy and paste to repeatedly solve a problem (btw: I flip my shit when people do this on codebases I work on). To them, "boilerplate" represents a common set of rote solutions to isomorphic problems. The actual lines of code might be quite different, but the approach is the same. That's not necessarily something you can copy-paste.

Coming at this from a computer-science or PLT perspective, this idea of an "abstract, repeatable meta-boilerplate" is exactly the payoff we expect from language features like strong type systems. Part of the point of rigorous languages is to create these kinds of patterns. You had total expressiveness back in assembly language! Repeatable rigor is most of the point of modern languages.

skydhash

5 days ago

> To them, "boilerplate" represents a common set of rote solutions to isomorphic problems.

That's what libraries and frameworks are here for. And that's why no experienced engineers consider those an issue. What's truly important is the business logic, then you find a set of libraries that solves the common use cases and you write the rest. Sometimes you're in some novel space that doesn't have libraries (new programming language), but you still have specs and reference implementation that helps you out.

The actual boilerplate is when you have to write code twice because the language ecosystem don't have good macros à la lisp so you can invent some metastuff for the problem at end. (think writing routers for express.js)

tptacek

5 days ago

People keep saying this. LLMs (not even agents; LLMs themselves, intrinsically) use frameworks. They're quite good at them. Frameworks make programs more legible to LLMs, not less.

danShumway

5 days ago

> LLMs (not even agents; LLMs themselves, intrinsically) use frameworks.

That's not what I see the parent comment saying. They're not saying that LLMs can't use frameworks, they're saying that if you have rote solutions that you are being forced to write over and over and over again, you shouldn't be using an LLM to automate it, you should use a framework and get that code out of your project.

And at that point, you won't have a ton of boilerplate to write.

The two sides to this I see online are between the people who think we need a way to automate boilerplate and setup code, and the people who want to eliminate boilerplate (not just the copy-paste kind, but also the "ugh, I've got to do this thing again that I've done 20 times" kind).

Ideally:

> a common set of rote solutions to isomorphic problems

Should not be a thing you have to write very often (or if it is, you should have tools that make it as quick to implement as it would be to type a prompt into an LLM). If that kind of rote repetitive problem solving is a huge part of your job, then to borrow your phrasing: the language or the tools you're using have let you down.

theshrike79

5 days ago

LLMs are _really_ good at React for example, just because there's so much of it everywhere for them to learn from.

jazzyjackson

5 days ago

Copy pasting code that could be abstracted is not a usage of boilerplate I've ever encountered, usually it's just a reference to certain verbose languages where you have to write a bunch of repetitive low-entropy stuff to get anywhere, like getters and setters in java classes.

bccdee

5 days ago

Getters and setters definitely fall into "copy/paste + macro" territory for me. Just copy/paste your field list and run a macro that turns each field into a getter and setter. Or use an IDE shortcut. Lombok obviates all this anyway of course.

samtheprogram

5 days ago

lol shadcn

bccdee

5 days ago

> The actual lines of code might be quite different, but the approach is the same. That's not necessarily something you can copy-paste.

Assuming something like "a REST endpoint which takes a few request parameters, makes a DB query, and returns the response" fits what you're describing, you can absolutely copy/paste a similar endpoint, change the parameters and the database query, and rename a couple variables—all of which takes a matter of moments.

Naturally code that is being copy-pasted wholesale with few changes is ripe to be abstracted away, but patterns are still going to show up no matter what.

theshrike79

5 days ago

But the LLM will write that pretty much instantly after you've given it one example to extrapolate from.

It'll even write basic unit tests for your CRUD API while it's at it.

bccdee

5 days ago

Sure—but I can also write it pretty instantly, with some judicious copy/pasting.

And the less instantly I can write it, the more petty nuances there are to deal with—things like non-trivial validation, a new database query function, a header that I need to access—the more ways an LLM will get it subtly wrong.

If I treat it as more than a fancy autocomplete, I have to spend all my time cleaning up after it. And if I do treat it as fancy autocomplete, it doesn't save that much time over judicious copy/pasting.

frutiger

5 days ago

> solutions to isomorphic problems

“Isomorphic” is a word that describes a mapping (or a transformation) that preserves some properties that we believe to be important.

The word you’re looking for is probably “similar” not “isomorphic”. It sure as hell doesn’t sound as fancy though.

tptacek

5 days ago

... yes, that is why I chose the word? Literally: preservation of structural similarity. Not simply "similarity", which could mean anything.

frutiger

5 days ago

What’s the morphism here?

foobar10000

5 days ago

It used to be the library one wrote to solve them at once - the solution being isomorphic that is, not the problems. I.e. given problems a b c d, f(a) solves a, g(b) solves b, … j(d) solves d - turns out if you can write a common function F instead of f g h j separately, then clearly F is isomorphic to f g h j under our constraints.

I guess in this case the morphism is the similar or same prompt to generate f g h j.

tedious-coder

5 days ago

But what do you make of the parent’s second paragraph? This is the way I feel as well - I would rather not spend my time asking AI to do something right that I could just do myself.

I bit the bullet last week and tried to force myself to use a solution built end to end by AI. By the time I’d finished asking it to make changes (about 25 in total), I would’ve had a much nicer time doing it myself.

The thing in question was admittedly partially specified. It was a yaml-based testing tool for running some scenarios involving load tests before and after injecting some faults in the application. I gave it the yaml schema up front, and it did a sensible job as a first pass. But then I was in the position of reading what it wrote, seeing some implicit requirements I’d not specified, and asking for those.

Had I written it myself from the start, those implicit requirements would’ve been more natural to think about in the progression of iterating on the tool. But in this workflow, I just couldn’t get in a flow state - the process felt very unnatural, not unlike how it would’ve been to ask a junior to do it and taking 25 rounds of code review. And that has always been a miserable task, difficult to force oneself to stay engaged with. By the end I was much happier making manual tweaks and wish I’d have written it myself from the start.

Aeolun

5 days ago

I'm firmly convinced at this point that there is just no arguing with the haters. At the same time, it feels like this transition is as inevitable as the transition to mobile phones. LLM's are everywhere, and there's no escaping it no matter how much you might want to.

There's always some people that will resist to the bitter end, but I expect them to be few and far between.

bccdee

5 days ago

It's not really a matter of wanting to escape them. I've used them. They're not good for writing code. They feel zippy, but you have to constantly clean up after them. It's as slow and frustrating as trying to walk a junior engineer through a problem they don't fully understand. I'd rather do it myself.

If the AI agent future is so inevitable, then why do people waste so much oxygen insisting upon its inevitability? Just wait for it in silence. It certainly isn't here yet.

mwcampbell

5 days ago

> there's no escaping it no matter how much you might want to

And if we accept that inevitability, it becomes a self-fulfilling prophecy. The fact that some people _want_ us to give in is a reason to keep resisting.

GenshoTikamura

5 days ago

There is absolutely no inevitability as long as there is a willingness to contemplate what is happening.

superconduct123

5 days ago

Your article comes across like you think every developer is the exact same as you, very egocentric piece

Not everyone is just cranking out hacked together MVPs for startups

Do you not realize there are many many other fields and domains of programming?

Not everyone has the same use case as you

Uehreka

5 days ago

> Not everyone is just cranking out hacked together MVPs for startups

Now here’s the fun part: In a really restrictive enterprise environment where you’ve got unit tests with 85% code coverage requirements, linters and static typing, these AI programming assistants actually perform even better than they do when given a more “greenfield” MVP-ish assignment with lots of room for misinterpretation. The constant “slamming into guardrails” keeps them from hallucinating and causes them to correct themselves when they do.

The more annoying boxes your job makes you tick, the more parts of the process that make you go “ugh, right, that”, the more AI programming assistants can help you.

BobbyTables2

5 days ago

Unfortunately high code coverage is misaligned with high quality code.

If one copy-pastes a routine to make a modified version (that’s used), code coverage goes UP. Sounds like a win win for many…

Later, someone consolidates the two near identical routines during a proper refactoring. They can even add unit tests. Guess what? Code coverage goes DOWN!

Sure, having untested un-executed code is a truly horrible thing. But focusing on coverage can be worse…

godelski

5 days ago

Didn't we quickly turn against TDD after it became unavoidably obvious that it doesn't work?

What is which this new paradigm where we act like everything is easily measurable and every measure is perfectly aligned with what we want to measure. We know these things aren't possible. It doesn't take much thought to verify this. Do you believe you are smart enough to test all possible issues? No one is. There'd be no CVEs if you could and we'd have solved all of physics centuries ago

WWLink

5 days ago

Trash Driven Development :D

hatthew

5 days ago

A human might see a gap in guardrails and avoid it, or upon seeing unexpected behavior they might be able to tell that a guardrail was breached and have some intuition of where. An LLM will happy burst through a gap in the guardrails, claim it has solved the problem, and require just as much human effort to fix and even more long-term maintenance because of less code familiarity.

tptacek

5 days ago

An LLM, which has deep advantages over humans in rote pattern recognition, will see gaps in guardrails that humans miss. In the end, it's a wash. What's not a wash is all the rest of the instrumentation and formal methods stuff an LLM agent enables, all of which is available to every dev team today but doesn't get used because type safety, semgrep rules, advanced linters, and TLA+ specs are too time-consuming for humans to deploy. Not a problem for an LLM agent.

heyts

5 days ago

I’m not entirely sure this is the silver lining you seem to think it is

usef-

5 days ago

Which part of his article is specific to his use case? It all looks fairly general to me.

nostrebored

5 days ago

And not everyone working in things that aren’t “hacked together MVPs” has your experience. You can read any number of reports about code generated at FAANG, incident response tooling that gets to RCA faster, etc.

There are obviously still things it can’t do. But the gap between “I haven’t been able to get a tool to work” and “you’re wrong about the tool being useful” is large.

sensanaty

5 days ago

> ...code generated ar FAANG...

Oh, like these set of PRs on the dotnet repo? https://news.ycombinator.com/item?id=44050152

RodgerTheGreat

5 days ago

At least when you mentor an actual junior developer they often learn, and you can take satisfaction in aiding the growth of a human being. All the time and effort spent coaxing an LLM to "do better" either disappears in a puff of smoke the next time it goes schizoid and needs to have its context cleared or- at best- is recorded to help a for-profit company train their next generation of products.

Like everything else about the "GenAI" fad, it boils down to extractively exploiting goodwill and despoiling the commons in order to convert VC dollars into penny-shavings.

d--b

5 days ago

Boilerplate is a cinch when you already know what to do.

I work in finance, I have for almost 20 years now. There are things in finance you do once every 5 years, like setting up a data source like Bloomberg in a new programming language. Now you know from the last time you did it that it’s a pain, you need to use a very low level api, handling all the tiny messages yourself, building up the response as it comes from the source in unordered packets. It’s asynchronous, there is a message queue, and what I specialize in is maths.

Now I could spend hours reading documents, putting crap together, and finally come up with some half baked code that ignores most possible error points.

Or I could use ChatGPT and leverage the fact that hundreds of implementations of the same module exist out there. And make something that just works.

That is the first ever coding question I asked an LLM and it literally saved me days of trial and error for something where my added value is next to zero.

Similarly I use LLMs a lot for small tasks that are in fact fairly difficult, and that don’t add any value to the solution. Things like converting data structures in an efficient way using Python idioms, or JavaScript 2023 features, that there is no way I can keep up with.

closeparen

5 days ago

The thing that makes an agent special is making some kind of effort to gather the relevant context before generating. The quality of the edits from the "agent" panel in Cursor/Copilot/etc is quite a bit higher than the type-ahead suggestions or the inline prompt stuff.

Bizarrely though, it seems to be limited to grep for the moment, doesn't work with LSP yet.

shlant

5 days ago

OP: https://fly.io/blog/youre-all-nuts/#but-its-bad-at-rust

> (from https://lukasatkinson.de/2025/net-negative-cursor/)

looks inside

complaining about Rust code

godelski

5 days ago

Plus, looks like it just hard coded vales. I see this happen a lot with AI code. Even when I try to get it to not it still tends to do it.

Issues like that are simple and just create debt. Sure, it "works" now but who writes code not knowing that we're going to change things next week or next month. It's the whole reason we use objects and functions in the first place!

bccdee

5 days ago

Yeah, only in Rust is the maximum value of an unsigned 16-bit integer 65535.

These aren't Rust-specific syntax foibles. It's not a borrow-checker mistake or anything. These are basic CS fundamentals that it's thoughtlessly fumbling.

phanimahesh

5 days ago

TBH the critique is completely valid when cursor advertised shitty code on their homepage.

chotmat

5 days ago

the rust code in question is the example on the cursor landing page though

godelski

5 days ago

The current image on the landing page might be even worse. It just updates Message to MessageV1. Why would you pay money for what's a string replacement?

The comment on the right says it'll help the user with protocol versioning. This is not how you do that...

monkaiju

5 days ago

Adding this comment to my HN bookmarks! Well said

naasking

5 days ago

> Most of what we write is not highly creative, but it is load-bearing, and it's full of choices.

The idea that you can't specify the load bearing pillars of your structure to the AI, or that it couldn't figure them out by specifying the right requirements/constraints, will not age well.

bccdee

5 days ago

> The idea that you can't specify the load bearing pillars of your structure to the AI

But English is a subjective and fuzzy language, and the AI typically can't intuit the more subtle points of what you need. In my experience a model's output always needs further prompting. If only there were a formal, rigorous language to express business logic in! Some sort of "programming language."

naasking

5 days ago

> But English is a subjective and fuzzy language, and the AI typically can't intuit the more subtle points of what you need.

I disagree on the "can't". LLMs seem no better or worse than humans at making assumptions when given a description of needs, which shouldn't be surprising since they infer such things from examples of humans doing the same thing. In principle, there's nothing preventing a targeted programming system from asking clarifying questions.

> In my experience a model's output always needs further prompting.

Yes, and the early days of all tooling were crude. Don't underestimate the march of progress.

whattheheckheck

5 days ago

What have you written with ai that ha made you or your business money

JumpCrisscross

5 days ago

> What have you written with ai that ha made you or your business money

I use R a little more than I should, given the simplicity of my work. Claude writes better R quicker than I can. I double check what it's doing. But it's easier to double check it used twang correctly than spend five trying to remember how to use the weird package that does propensity scoring [1].

I'm sure data analysis will still sort of be a thing. But it's just not as useful anymore in the form of a human being for most commercial applications at sub-enterprise scale.

[1] https://cran.r-project.org/web/packages/twang/index.html

TheRoque

6 days ago

One of the biggest anti LLM arguments for me at the moments is about security. In case you don't know, if you open a file with copilot active or cursor, containing secrets, it might be sent to a server a thus get leaked. The companies say that if that file is in a cursorignore file, it won't be indexed, but it's still a critical security issue IMO. We all know what happened with the "smart home assistants" like Alexa.

Sure, there might be a way to change your workflow and never ever open a secret file with those editors, but my point is that a software that sends your data without your consent, and without giving you the tools to audit it, is a no go for many companies, including mine.

zmmmmm

5 days ago

It's why I use Aider, because it only operates on explicit files that you give it. Works great with OpenAI but if you are really worried, it interfaces perfectly with Ollama for local LLMs. A 12b model on my Mac does well enough for coding that it's serviceable for me.

MYEUHD

5 days ago

Which 12b model are you running?

zmmmmm

5 days ago

Gemma 12b quantized (gemma3:12b-it-qat in ollama)

metaltyphoon

5 days ago

At day job while someone was trying out windsurf, it simply picked up an environment variable that contained sensitive data and used it in code. This is wild.

noobermin

5 days ago

Sometimes I wonder if all the hype around being left behind and needing to try these things just to see how great they can be is being deliberately put out there to increase their training data.

Too many vibe coders contribute trash code if any. They need more code from so called experts that isn't open source yet.

knallfrosch

5 days ago

It's pretty unlikely someone at Cursor cares about accessing your Spring Boot project on GitHub through your personal access token – because they already have all your code.

tjhorner

5 days ago

I don't think that's the threat model here. The concern is regarding potentially sensitive information being sent to a third-party system without being able to audit which information is actually sent or what is done with it.

So, for example, if your local `.env` is inadvertently sent to Cursor and it's persisted on their end (which you can't verify one way or the other), an attacker targeting Cursor's infrastructure could potentially compromise it.

nickysielicki

5 days ago

You write your secrets to disk?

danielbln

5 days ago

Having a gitignored .env file is a pretty common pattern.

nickysielicki

5 days ago

If you say so. I’d never do that.

root_axis

6 days ago

The problem with LLMs for code is that they are still way too slow and expensive to be generally practical for non-trivial software projects. I'm not saying that they aren't useful, they are excellent at filling out narrow code units that don't require a lot of context and can be quickly or automatically verified to be correct. You will save a lot of time using them this way.

On the other hand, if you slip up and give it too much to chew on or just roll bad RNG, it will spin itself into a loop attempting many variations of crap, erasing and trying over, but never actually coming closer to a correct solution, eventually repeating obviously incorrect solutions over and over again that should have been precluded based on feedback from the previous failed solutions. If you're using a SOTA model, you can easily rack up $5 or more on a single task if you give it more than 30 minutes of leeway to work it out. Sure, you could use a cheaper model, but all that does is make the fundamental problem worse - i.e. you're spending money but not actually getting any closer to completed work.

Yes, the models are getting smarter and more efficient, but we're still at least a decade away from being able to run useful models at practical speeds locally. Aggressively quantized 70b models simply can't cut it, and even then, you need something like 10k tps to start building LLM tools that can overcome the LLM's lack of reasoning skills through brute force guess and check techniques.

Perhaps some of the AI skeptics are a bit too harsh, but they're certainly not crazy in the context of breathless hype.

dyauspitr

5 days ago

What are you talking about? It’s $20/month.

danielbln

5 days ago

OP is talking about API usage, and yes if you let Claude 4 Opus crunch on a long task it'll eat tokens/money like nothing else.

Those $20/month plans usually come with throttling, or other type of service degradation once you max out their allotment.

gspencley

5 days ago

The reason that I personally don't use LLMs was not addressed by the article:

I haven't found a way to use it that makes me develop faster.

The articles talks about "tedious code." If you need to generate a large static value table or something, then OK an LLM might give you a really fast result and cut through the tedium. Most of us were already writing short scripts to do that. I'm open to the possibility that an LLM can do it faster. But it's such a rare requirement that the productivity gains are truly negligible here even if they can. And in those cases, it's obvious what the repetitive task needs to be. I often find myself writing the code by hand to be quicker than coming up with a prompt to get it to write the code that I then need to review for correctness.

The article then mentions scaffolding. Things like "bookkeeping" when it comes to creating and setting up a new repo (whatever he means by that). This is why I have, historically, been a big fan of frameworks and generators. Point being, this is already a solved problem and I haven't found a way to further improve the state of this world with LLMs. LLMs might be an alternate tool that work just as well. But they haven't made my existing daily workflow any faster. Setting up new repos is also something that is done so rarely that even if an LLM netted a 100% increase in efficiency, it wouldn't really impact much.

I am an AI "skeptic" but I'm not a naysayer. I do use LLMs regularly. I just don't use them for developing code because I have yet to find a problem that they solve for me. Don't get me wrong, there are problems that they can solve... I just haven't come across any solutions to previously-unsolved problems. Meaning I can swap an existing solution for an LLM-based one... and it is a valid solution... but I don't observe any increase in productivity from doing so. The existing solution was already working fine.

I am genuinely looking forward to the day when this changes. When I identify a single existing problem without an existing solution that LLMs solve for me when developing software. I just have yet to come across one.

incomingpain

5 days ago

>I haven't found a way to use it that makes me develop faster.

I have pycharm pro with the ai. I can right click a function and there's 12 things AI can just do for you.

"find problems" where the linter has no concerns and it finds problems that dont show up from linter are shockingly good. This is like "Hey, you need to make sure this function is only running once at a time or it can easily overwhelm the cpu."

"write documentation" and it just fills in the documentation. I havent a clue what it writes and I wont be doing it.

"generate unit tests" is interesting? I have literally never written a unit test before and the code it produces is lengthy. Unit tests are a waste of time; but others disagree and so AI can produce it.

"explain code" look I wrote the code years ago half drunk, half high. No comments. It explains it all in depth and perfectly. Comments sometimes go stale after bug fixes and then make things confusing. Comments arent important when you have this beautifulness.

Lets go back to that cpu DOS attack example. I then can prompt what is claude 3.5 sonnet to give me to solution to fix it. It gives me this beautiful cache option. I could just hit apply and it'd be done. But then i said it's too complicated give me something simpler. It then gives me a solution that is simpler.

>nd in those cases, it's obvious what the repetitive task needs to be. I often find myself writing the code by hand to be quicker than coming up with a prompt to get it to write the code that I then need to review for correctness.

Its all about the boring busy work crap that doesnt need to be done that now gets done.

philote

5 days ago

It's hard to take a developer seriously when they don't document their code and think unit tests are a waste of time.

vdqtp3

5 days ago

And they're typically the ones cheering the loudest for LLM centric coding. It's hard to believe in something people you don't respect are saying is the best thing they've ever seen.

gspencley

5 days ago

> "generate unit tests" is interesting? I have literally never written a unit test before and the code it produces is lengthy. Unit tests are a waste of time; but others disagree and so AI can produce it.

This statement worries me for a number of reasons.

First, I work on a very massive codebase, with a large engineering organization. And I have seen a number of AI generated unit tests. I have not seen a single example of an LLM-generated unit test that didn't contain a number of test anti-patterns. To the extent where I would argue that they are creating a self-fulfilling prophecy. You said you think that unit tests are a waste of time. I would argue that they CAN be even worse than that.

The benefit of unit tests is that, at their best, they give you a safety net for refactoring existing code. If you change the implementation of a system under test, and the tests pass, you know you didn't introduce a breaking change.

But a lot of tests couple themselves tightly to the implementation details. Every single LLM-generated unit test I have observed in the wild introduces this anti-pattern. If you have a system under test, and changing the implementation of that system without breaking its behaviour causes a test to fail... that's called the "Fragile Test" problem. Now your unit test is not only failing to encourage you to refactor code... it's actively DISCOURAGING you from refactoring code. In this case, the unit test is providing DISVALUE rather than value.

So the fact that a) you think unit tests are a waste of time and b) you look at AI as a way to save you from a chore ... tells me that you have no business ever writing uint tests with or without AI. Please stop. You are making the world worse by leveraging an LLM to do these things for you.

I have NEVER looked at writing a unit test by hand as a "chore" or as a "waste of time." I often write my tests before even writing my implementation code, because doing so helps me think through both the design and requirements of my code... and gives me a little mini-sandbox context where I can make sure that the brand new code that I am writing is doing what I want it to. It's a problem solving tool. Not something to be done after the fact as a chore.

"Write Documentation" is not writing code. And if you don't read the documentation you're generating, no one else will. So what's the point of having it in the first place if no one reads it? Food for thought.

"Find problems" I see as being akin to a spell check, or the red squigglies when you have a syntax error. I do see the value of having ML tech within the internal IDE implementation. We've been using ML for email SPAM filters for decades. There are interesting problems that can be solved there. But this is an area where I want my IDE to flag something for me. It's not me actively prompting an LLM to generate code for me.

incomingpain

4 days ago

>First, I work on a very massive codebase, with a large engineering organization.

I sure dont. I'm a cyber security 0day writer or linux bash scripter or yikes coder, who has made some bad turns and ended up with some open source projects in python that absolutely nobody would call large or massive.

>The benefit of unit tests is that, at their best, they give you a safety net for refactoring existing code. If you change the implementation of a system under test, and the tests pass, you know you didn't introduce a breaking change.

Never refactor. This is literally why everyone loves objected orientated and paint themselves in a corner that blows up in their face.

>Now your unit test is not only failing to encourage you to refactor code... it's actively DISCOURAGING you from refactoring code. In this case, the unit test is providing DISVALUE rather than value.

That sounds perfect to me. Never refactor, you end of life or version that old code out, and replace it with the new perfect code. We have WPA #3, TLS 1.3, SSH 2, IKE v2, HTTP 3.

It's fine to replace until you get it right.

>So the fact that a) you think unit tests are a waste of time and b) you look at AI as a way to save you from a chore ... tells me that you have no business ever writing uint tests with or without AI. Please stop. You are making the world worse by leveraging an LLM to do these things for you.

We live in very different worlds. I do like the idea that you feel like im making the world worse by having AI do the chores.

>I have NEVER looked at writing a unit test by hand as a "chore" or as a "waste of time."

Me neither. I just dont do them. with how awesome AI has been... im probably going to have the time to read for the first time: https://docs.djangoproject.com/en/5.2/topics/testing/

>I often write my tests before even writing my implementation code, because doing so helps me think through both the design and requirements of my code... and gives me a little mini-sandbox context where I can make sure that the brand new code that I am writing is doing what I want it to. It's a problem solving tool. Not something to be done after the fact as a chore.

I dont code anything so complicated. I come from linux, but they call it the 'unix philosophy'. Do one thing and do it well. Kind of akin to KISS principle.

Write clean simple code that self-documents itself that legit cant really go wrong. Especially if some jerk comes along like me who throws a fuzzer at it or worse user input.

>"Write Documentation" is not writing code. And if you don't read the documentation you're generating, no one else will. So what's the point of having it in the first place if no one reads it? Food for thought.

Good point.

>"Find problems" I see as being akin to a spell check, or the red squigglies when you have a syntax error.

Community pycharm has all that for free. The "find problems" is entirely next level. I'm so very sold.

I just wrote a stupid def test(): which returns a string. The linter has not a single thing to say.

1. *Try Block Contains No Risk Operations*: - The try-except block is wrapping a simple string operation that won't raise exceptions - Makes the try-except block unnecessary

>I do see the value of having ML tech within the internal IDE implementation. We've been using ML for email SPAM filters for decades. There are interesting problems that can be solved there. But this is an area where I want my IDE to flag something for me. It's not me actively prompting an LLM to generate code for me.

Since im public open source project on github I got copilot for free like 3 years ago and my biggest complaint was that it was generating garbage code for me and forcing it. It got so annoying i disabled it and stopped using it. Im sure it has improved since then.

Pycharm's AI on occasion has a tab completion thats really good, but isnt forcing.

gspencley

4 days ago

> Never refactor. This is literally why everyone loves objected orientated and paint themselves in a corner that blows up in their face.

I try to get people to think about "write once" code as much as possible. Keep things small, single-purpose and composable. You maximize reuse potential, make testing a lot easier and greatly ease debugging.

But you're right about one thing: we do live in very different worlds. In the product world, the business never knows what they are making until something is in front of them and they decide they want something else. Therefore the majority of developer time is spent "maintaining" existing code (read: making code changes) vs building new code.

In that world, the "never refactor" mentality is suicide. Refactoring is improving the design of existing code in order to make it simpler and easier to change as requirements change.

And that is the one universal truth of software: it changes.

If you don't need the code to change, ever, then you might as well stick with a fixed circuit.

> I dont code anything so complicated. I come from linux, but they call it the 'unix philosophy'. Do one thing and do it well. Kind of akin to KISS principle.

So do I. I use Linux on all of my devices and came up as a *nix developer. Unit tests favour the unix philosophy. I often think of a "unit" of code as a single logical operation. But logic is a function of the human mind and humans make errors. Even in simple, one-liner, single-responsibility functions logic errors creep up. Then consider how your programming language helps you or aids you. I see type casting induced errors all the time that a simple unit test would have caught and avoided.

> Write clean simple code that self-documents itself that legit cant really go wrong.

Then why did you even bring up writing documentation lol. We're aligned here. I never document code because I consider that to be a failure of the code. Documentation and code need to be kept in sync... which never happens, not to mention people don't read it. So favour code that is self-explanatory.

> I do like the idea that you feel like im making the world worse by having AI do the chores.

You're deliberately straw-manning me there. If writing a unit test is a chore, your approach to testing is flawed. That was my point and you know it. My comment was also specific to using AI to write unit tests. Not to do anything else.

baobun

6 days ago

The privacy aspect and other security risks tho? So far all the praise I hear on productivity are from people using cloud-hosted models.

Claude, Gemini, Copilot and and ChatGPT are non-starters for privacy-minded folks.

So far, local experiements with agents have left me underwhelmed. Tried everything on ollama that can run on my dedicated Ryzen 8700G with 96GB DDR5. I'm ready to blow ~10-15k USD on a better rig if I see value in it but if I extrapolate current results I believe it'll be another CPU generation before I can expect positive productivity output from properly securely running local models when factoring in the setup and meta.

simonw

6 days ago

Almost all of the cloud vendors have policies saying that they will not train on your input if you are a paying customer.

The single biggest productivity boost you can get in LLM world is believing them when they make those promises to you!

JohnKemeny

5 days ago

> The single biggest productivity boost you can get in LLM world is believing them when they make those promises to you!

I'm having a hard time interpreting what you mean here. It sounds like something straight out of a cult.

simonw

5 days ago

An LLM vendor says to you "we promise not to train on your input". You have two options:

1. Believe them. Use their products and benefit from them.

2. Disbelieve them. Refuse to use their products. Miss out on benefiting from them.

I pick option 1. I think that's the better option to pick if you want to be able to benefit from what this technology can do for you.

Personally I think "these people are lying about everything" is a stronger indication of a cult mindset. Not everyone is your enemy.

baobun

5 days ago

Well, I've been personally lied to about privacy claims by at least Google, Meta, Amazon, Microsoft. Some of which has been observed in courts. OpenAI communication has obviously been dishonest and shady at times if you keep track. All of the above have fallen in line with current administration and any future demands they may have to pin down or cut off anyone opposing certain military acts against civilians or otherwise deemed politically problematic. DeepSeek's public security snafu does not instil confidence that they can keep their platform secure even if they tried. And so on.

Fool me twice, you can't get fooled again.

taurath

5 days ago

The worst part to me is how little anyone seems to care about privacy - it just is how the world is. The US economy (or at least almost all e-marketing) seems to run on the idea that there's no such thing as privacy by default. Its not a subject that is talked about nearly enough. Everything is already known by uncle sam regardless. Its really strange, or maybe fortunate, that we're basically at a place that we often worried about but things haven't gone totally wrong yet. Corporate governance has been not that terrible (they know that its a golden goose they can't unkill). We'll see what happens in the next decade though - a company like google with so much data but losing marketshare might be tempted to be evil, or in todays parlance, have a feduciary responsibility to juice peoples data.

easton

5 days ago

on the other hand, if AWS or Microsoft was caught taking customer data out of their clouds their business would be over. I don't know if AI has anything to do with it, inference is just another app they sell.

noobermin

5 days ago

Or for someone trying to convince you to give your code to train on for free.

simoncion

5 days ago

> ...have policies saying that they will not train on your input if you are a paying customer.

Those policies are worth the paper they're printed on.

I also note that if you're a USian, you've almost certainly been required to surrender your right to air grievances in court and submit to mandatory binding arbitration for any conflict resolution that one would have used the courts for.

simonw

5 days ago

How many paying customers do you think would stick around with an AI vendor who was caught training new models on private data from their paying customers, despite having signed contracts saying that they wouldn't do that?

I find this lack of trust quite baffling. Companies like money! They like having customers.

simoncion

5 days ago

If you pay attention, you see that the cost to large companies of reputational damage is very, very small. "The public" has short memories, companies tend to think only about the next quarter or two, PR flacks are often very convincing to Management, and -IME- it takes a lot of shit for an enterprise to move away from a big vendor.

And, those who are pay attention notice that the fines and penalties for big companies that screw the little guys are often next-to-nothing when compared with that big company's revenue. In other words, these punishments are often "cost of doing business" expenses, rather than actual deterrents.

So, yeah. Add into all that a healthy dose of "How would anyone but the customers with the deepest pockets ever get enough money to prove such a contract violation in court?", and you end up a profound lack of trust.

simonw

5 days ago

Companies tend to be a lot more cautious at spending their money than consumers are.

This space is fiercely competitive. If OpenAI turn out to be training on private data in breach of contract, their customers can switch to Anthropic.

oblio

6 days ago

This is probably the biggest danger. Everyone is assuming optimization work reduces cost faster than these companies burn through capital. I'm half inclined to assume optimization work will do it, but it's far from as obvious as they want to portray it.

NitpickLawyer

5 days ago

> So far, local experiements with agents have left me underwhelmed.

Devstral (mistral small fine-tuned for agentic use coding) w/ cline has been above expectations for me.

storus

5 days ago

MacStudio with 512GB RAM starts at around 10k and quantized DeepSeek R1 671B needs around 400GB RAM, making it usable for your needs. It produced some outstanding code on many tasks I tried (some not so outstanding as well).

baobun

5 days ago

Am I right in assuming that running Linux (or anything else than macOS) on the MacStudio is experimental at best?

I'd be looking for something that can run offline and receive system updates from an internal mirror on the airgapped network. Needing to tie an AppleID to the machine and allow it internet access for OS updates is a hard sell. Am I wrong in thinking that keeping an airgapped macOS installation up to date would additional infrastructure that requires some enterprise contract with Apple?

storus

5 days ago

IIRC you can download OS update/installation DMG from Apple, put it on a USB key and run it on airgapped system. I don't think you even need Apple ID. MacOS with homebrew works more-less like Linux, at least tooling is basically the same. You won't be able to install any Linux on M3 Ultra.

BeetleB

6 days ago

Privacy is not binary, and it would make it easier if you outlined specific scenarios.

Most providers promise not to train on inputs if used via an API (and otherwise have a retention timeline for other reasons).

I'm not sure the privacy concern is greater than using pretty much any cloud provider for anything. Storing your data on AWS: Privacy concern?

baobun

5 days ago

> Storing your data on AWS: Privacy concern?

Unencrypted? You bet.

Swiffy0

5 days ago

My 5 cents would be that LLMs have replaced all those random (e.g. CSS, regex etc) generators, emmet-like IDE code completion/generator tools, as well as having to google for arbitrary code snippets which you'd just copy and paste in.

In no way can AI be used for anything larger than generating singular functions or anything that would require writing to or modifying multiple files.

Technically you might be able to pull off having AI change multiple files for you in one go, but you'll quickly run into sort of "Adobe Dreamviewer" type of issue where your codebase is dominated by generated code which only the AI that generated it is able to properly extend and modify.

I remember when Dreamviewer was a thing, but you essentialyl had to make a choice between sticking with it forever for the project or not using it at all, because it would basically convert your source code into it's own proprietary format due to it becoming so horribly messy and unreadable.

Regardless, AI is absolutely incredible and speeds up development by a great deal, (even) if you only use it to generate small snippets at the time.

AI is also an absolute godsend for formatting and converting stuff from anything and to anything - you could e.g. dump your whole database structure to Gemini and ask it to generate an API against it; big task, but since it is basically just a conversion task, it will work very well.

Animats

5 days ago

Here are two routine problems I have to solve at the moment. Can any of the current LLM systems do either?

1. Input is an 256x256 pixel elevation map stored as a greyscale .png file, and a minimum and maximum elevation. A pixel value of 0 corresponds to the minimum elevation, and a pixel value of 255 corresponds to the maximum elevation. Read in the .png file and the elevation limits. Then construct a 256x256 floating point array of heights. From that array, construct a triangle mesh with X and Y dimensions 0..255. Perform a mesh reduction operation on the triangle mesh to reduce the number of triangles. Mesh reduction must not generate holes in the mesh. From the reduced mesh, generate a glTF file where the UV parameters run from 0.0 to 1.0 along the X and Y axes.

2. Given four glTF files constructed as above, corresponding to four quadrants of a larger square, construct a single 511x511 mesh which combines all four input meshes to cover a larger area. Because the input meshes are 0..255, not 0..256, there will be gaps where the four quadrants meet. Fill those gaps with reasonable triangles. Perform a mesh reduction as above. From the reduced mesh, generate a glTF file where the UV parameters run from 0.0 to 1.0 along the X and Y axes.

Rust code is preferred; Python code is acceptable.

So, what service should I sign up for?

csomar

5 days ago

Tackling 1: No current LLM will be able to do that. Create your Rust project, figure out the dependencies that could be useful on your own, then start by asking which types can be expressed. Then after you have revised the types, figure out (with or without LLM) which functions have to be implemented. Then ask LLMs to implement these functions with tests. With some modifications, you should be able to have a final working solution.

Edit: took a while but here is the code generated by DeepSeek: https://gist.github.com/omarabid/b30195011abf570fde2a084cd11... If you have an example file I could give it a test run.

Animats

5 days ago

> Tackling 1: No current LLM will be able to do that.

Github Copilot seems to be able to do it. Mostly because it knows how to call Meshlab. The documentation for Meshlab is hard to search, and having an LLM find the correct function is itself helpful.

Also, it looks like quadric decimation has improved since the last time I looked at it. That algorithm, if pushed too hard, will generate meshes with holes. Now there are options ("preservetopology") which supposedly prevent that. I last looked at that a few years ago, and the only mesh decimator which handled that correctly was the one inside Unreal Engine Editor.

Now I have to test all this stuff, of course.

energy123

5 days ago

How have you found LLMs' Rust capabilities compared to Python or Typescript or other more common languages with more publicly available data?

Animats

5 days ago

I tried some LLMs about six months ago, and got a halfway decent translation between Linden Scripting Language and Lua. I didn't expect that to work at all, but it did, despite very different syntax. The parent article says that if you haven't tried this in six months, it's time to try again. It looks like it's going to be useful for game geometry wrangling, at least.

LLMs seem to be good at finding obscure library functions from a description of the function needed, something where where keyword searches aren't too helpful. Previously, searching Stack Overflow helped, but nobody is posting there any more. Good reference manuals are almost historical artifacts now.

csomar

5 days ago

Rust is the best if you carefully create and document your types. Sometimes you’ll get stuck in lifetime issue but it’s a sign the LLM didn’t do a good job.

JKCalhoun

5 days ago

Why don't you paste that into Claude, ChatGPT — report back what you find?

Animats

5 days ago

I tried Github Copilot, free mode, on #1. The Python code outsourced all the hard work to numpy and pymeshlab, which is fine. Copilot wrote code to generate triangles. It's a reasonable job. Copilot's knowledge of what to call is better than mine. I have to try it now, but it did OK.

On #2, it set up the problem, but bailed on the gap filling part with

   # (This is a complex step; see note at bottom.)

That's not surprising, since I didn't tell it how to solve the problem. Can any of the premium systems do #2?

nickpsecurity

5 days ago

When I used it, Copilot was for doing autocomplete on little, routine procedures. Like a few operations done on a list. I used it to fill in the blanks of tedious stuff in what I was already coding myself. It worked well at that.

However, if it took creativity or real analysis, I had to throw the big models at it with a carefully-written prompt. You want to use their recent, best, big models. I used to include precise descriptions of data structures and function API's, too, which helped a lot. From there, tweak the description until you find pieces that get the result.

Another thing. It will eventually synthesize code that's close but not quite it. I found that asking it to make small changes, one at a time, would help for a while. Ex: "Modify that code to do X" or "...X in part Y." It eventually started hallucinating in loops. I always had to fix some amount of what it generated but it still saved time.

I hope that helps.

cap11235

5 days ago

https://gist.github.com/cpsquonk/e9a6134e78a2c832161ca973803...

I did Qwen3-256B (a free model, but you'd need a host for something that large, probably. I used Kagi) and Claude Code.

Curious how these look to you.

Animats

5 days ago

It actually wrote out the code for all the hard stuff.

I like the Python code which outsourced the hard stuff to existing libraries. The odds of that working are higher.

Can you tell it to use the "glam" crate for the vectors, instead of writing out things like vector length the long way?

(We now need standardized low-level types more than ever, so the LLMs will use them.)

cap11235

5 days ago

I reopened Claude, and asked "Can you use the "glam" crate for the vectors, instead of writing out things like vector length the long way?"

https://gist.github.com/cpsquonk/348009eb7c83a7d499ff5ae70d7...

Animats

5 days ago

That's pretty good. Thanks.

agnishom

4 days ago

This is a great example.

Supposedly, you know something about this problem. If you use an LLM to craft a solution, that's probably fine since you'd know how well you did.

I know nothing about this problem. I could use an LLM to delude myself into thinking that I do. Should my boss be very happy about that? They probably shouldn't

perfmode

5 days ago

try claude code, in python

r0s

5 days ago

Weird to claim the llm does all the boring learning and boilerplate for you as a selling point, but then also insist we still need to responsibly read all the output, and if you can't understand it's a "skill issue".

Also the emphasis on greenfield projects? Starting is by FAR the easiest part. That's not impressive to me. When do we get to code greenfield for important systems? Reminds me of the equally absurd example of language choice. You think you get to choose? What?

Imagine all the code these agents are going to pump out that can never be reviewed in a reasonable time frame. The noise generated at the whim of bike-shedding vibe coders is going to drown all the senior reviewers soon enough. I'll call that Cowboy Coders on Steroids. Anyone with skills will be buried in reviews, won't have time for anything else, and I predict stricter code gen policies to compensate.

rtpg

5 days ago

I don't have a huge dog in this fight apart from AI advocates being annoying... but I would say that for greenfield projects the interesting thing is that I can get a skeleton of a working iOS app for something simple in like an hour of some copy/pasting stuff from ChatGPT. Instead of spending a good amount of time trying to get through learning material to do it.

It's nice to build throwaway things _so fast_, especially in the sort of fuzzy stuff like frontend where it's fine for it to be completely wrong. And then I can just use my own sense of how computers work to fix up what I care about, delete a bunch of what I don't care about... It's pretty amazing.

For existing projects I have only witnessed garbage output. I know people have success. I haven't seen it.

I have witnessed PMs taking a bullet pointed list of requirements and then use ChatGPT to generate paragraphs of text for some reason. You had the list!

layla5alive

5 days ago

And (rigorously) reviewing code is easily 10x harder than writing it.

But this won't stop it from happening.

tptacek

5 days ago

This is just obviously not true. I had a full-time job of reviewing code for roughly 15 years and it was never true, but it's also just intuitively not true that engineers spend 10 hours reviewing their peers code to every 1 they spend writing it.

What you mean to claim here is that verification is 10x harder than authorship. That's true, but unhelpful to skeptics, because LLMs are extremely useful for verification.

JohnKemeny

5 days ago

I once graded over 100 exams in an introductory programming course (Python). The main exercise was to implement a simple game (without using a runtime).

Some answers were trivial to grade—either obviously correct or clearly wrong. The rest were painful and exhausting to evaluate.

Checking whether the code was correct and tracing it step by step in my head was so draining that I swore never to grade programming again.

tptacek

5 days ago

Right, sure. So: this doesn't generally happen with LLM outputs, but if it does, you simply kill the PR. A lot of people seem to be hung up on the idea that LLM agents don't have a 100% hit rate, let alone a 100% one-shot hit rate. A huge part of the idea is that it does not matter if an agents output is immediately workable. Just take the PRs where the code is straightforwardly reviewable.

JohnKemeny

5 days ago

But your reply was to "reviewing code is easily 10x harder than writing it". Of course that's not true if you just kill all PRs that are difficult to review.

Sometimes, code is hard to review. It's not very helpful if the reviewer just kills it because it's hard.

girvo

5 days ago

> It's not very helpful if the reviewer just kills it because it's hard.

I am absolutely still an AI skeptic, but like: we do this at work. If a dev has produced some absolutely nonsense overcomplicated impossible to understand PR, it gets rejected and sent back to the drawing board (and then I work with them to find out what happened, because thats a leadership failure more than a developer one IMO)

tptacek

5 days ago

I understand everything about this comment except the words "but" and "it's not very helpful if".

fzeroracer

5 days ago

You're massively burying the lede here with your statement of 'just take the PRs where the code is straightforwardly reviewable'. It's honestly such an odious statement that it makes me doubt your expertise in reviewing code and PRs.

A lot of code can not and will not be straightforwardly reviewable because it all depends on context. Using an LLM adds an additional layer of abstraction between you and the context, because now you have to untangle whether or not it accomplished the context you gave it.

enraged_camel

5 days ago

I have no idea what you mean. Tptacek is correct. LLM does not add an additional layer because at the end of the day code is code. You read it and you can tell whether it does what you want because you were the person who gave the instructions. It is no different than reviewing the code written by a junior (who also does not add an additional layer of abstraction).

fzeroracer

4 days ago

That's exactly what an additional layer is! If I outsource coding to someone else whether it's a junior engineer, an outside firm or an LLM that is an additional layer of context you need to understand. You need to figure out if the junior engineer grasped the problem set, if the firm understood the requirements or if the LLM actually generated decent code.

girvo

5 days ago

While I'm still a skeptic (despite my Cursor usage at work), I still absolutely agree. Careful review has never been 10x more difficult than writing the code, I'm not sure where this comes from. And I've had about the same experience as you.

Also, and know this doesn't matter, but it's so weird to see this downvoted. That's not an "I disagree" button...

tptacek

5 days ago

Thanks! On HN, it actually is a "disagree" button though; totally cromulent to downvote disagreement here. One of the site's big idiosyncrasies!

whyever

5 days ago

If it is a disagree button, then why is it disabled for new users?

tptacek

5 days ago

I think the karma thresholds here are kind of silly given how low they are, but Dan and Tom actually run this place and they know things I don't know about what chaos would be unleashed if we eliminated them.

girvo

5 days ago

Weird, I always read it as more than that due to the karma threshold! The more you know :)

meowface

5 days ago

I really don't think that's true at all. Anyone here can read and sign off on 1000 lines of code faster than they can write the same 1000 lines of code, pretty much no matter what the code is.

theshrike79

5 days ago

It also depends heavily on the language.

I can review 100x more Go code in a set amount of time than I can, say React.

With Go there are repetitive structures (if err == nil) and not that many keywords, it's easy to sniff out the suspicious bits and focus on them.

With Javascript and all of the useWhatevers and cyclic dependencies and functions returning functions that call functions, it's a lot harder to figure out what the code does just by reading it.

meowface

5 days ago

I definitely get your point, but it's also pretty annoying to write weird indirect or meta-functional code, not just to read it. It's still almost always faster to read than to write.

I can think of edge cases where a certain section of code is easier and faster to write than to read, but in general - in our practical day-to-day experiences - reading a lot of code is faster than writing a lot of code. Not just faster but less mentally and physically taxing. Still mentally taxing, yes, but less.

sothatsit

5 days ago

You have misunderstood this "common wisdom".

The idea is that people will spend 10x more time reading your code in all future time stacked together. Not that reading and understanding your code once takes 10x the effort of writing it, which is obviously untrue.

Here is the quote from Clean Code, where this idea seems to originate from:

> Indeed, the ratio of time spent reading versus writing code is well over 10 to 1.

sfn42

5 days ago

I think the point is that the person orchestrating the agent(s) reviews the code. It doesn't make sense to have 5 Juniors using agents to submit PRs and a senior or two reviewing it all. You just have the senior(s) orchestrating agents and reviewing it themselves. Maybe one or two juniors because we still need to train new devs, but maybe the junior doesn't even use an LLM. Maybe the junior writes code manually so they can learn programming properly before they start being an "AI lead".

Everyone is still responsible for the code they produce. I review my own PRs before I expect others to, and I don't even use AI. I think what the article describes seems interesting though.

otabdeveloper4

5 days ago

Yeah, I remember five years ago (before being sold this latest VC crock of bull) when reviewing shitty thoughtless code was the tedious and soul-sucking part of programming.

AnotherGoodName

6 days ago

This happened with the introduction of smartphones too. Every slashdot post had a haughty and upvoted ‘why would i want such a thing!!!’.

It was obviously huge. You could see it taking off. Yet a lot of people proudly displayed ignorance and backed each other up on it to the point that discussion around the topic was often drowned out by the opposition to change. Now today it takes minutes of playing with ai coding agents to realise that it’s extremely useful and going to be similarly huge.

Resistance to change is not a virtue!

okanat

6 days ago

I still think smartphones are a huge negative to humanity. They improve a narrow case: having access to ephemeral knowledge. Nobody writes articles or does deep knowledge work with smartphones.

My position with the AI is almost the same. It is overall a net negative for cognitive abilities of people. Moreover I do think all AI companies need to pay fair licensing cost to all authors and train their models to accurately cite the sources. If they want more data for free, they need to propose copyright changes retroactively invalidating everything older than 50 years and also do the legwork for limiting software IP to 5 to 10 years.

Mofpofjis

5 days ago

Agreed. Smartphones are portable, mobile computers that suck at every single aspect of being, and working as, a computer, except for mobility. They are not powerful, they are not general purpose, they are not ergonimic, they do not respect user freedom or privacy. Use such a mobile device only when you can't avoid it (i.e., when you are on the road -- when mobility is the single most important priority), and at no other time.

simonw

6 days ago

"Nobody writes articles or does deep knowledge work with smartphones."

I don't think that's true.

I do most of my reading on a smart phone - including wading through academic papers, or reading full books in the kindle app and jotting down notes in the digital margins.

A sizable number of my short form blog entries are written on my phone, and my long form writing almost always starts out in Apple Notes on my phone before transferring to a laptop.

Predictive text and voice dictation has got good enough now that I suspect there have been entire books written on mobile devices.

Whether you want to consider it "deep knowledge work" or not is up to you, but apparently a lot of Fifty Shades of Grey was written on a BlackBerry! https://www.huffpost.com/archive/ca/entry/fifty-shades-of-gr...

steve_adams_86

6 days ago

> It is overall a net negative for cognitive abilities of people.

I agree. A bunch of us here might use it to scaffold applications we already understand, use it as a rubber duck to help understand and solve new problems, research more effectively, or otherwise magnify skills and knowledge we already have in a manner that's directed towards improving and growing.

That's cool. That's also not what most people will do with it. A bunch of us are total nerds, but most of the world really isn't like that. They want more entertainment, they want problems solved for them, they want ease. AI could allow a lot of people to use their brains less and lose function far more. For the minority among us who use it to do more and learn more, great. That group is a tiny minority from what I can tell.

Take for example that a huge use case for generative AI is just... More sophisticated meme images. I see so much of that, and I'm really not looking for it. It's such an insane waste of cycles. But it's what the average person wants.

Karrot_Kream

6 days ago

Just smartphones? I'd start at agriculture. Before agriculture, human society had little hierarchy. We were free the way we were meant to be.

homebrewer

6 days ago

> today it takes minutes of playing with ai coding agents to realise that it’s extremely useful

Yet some of us spent hours over the past three years playing with LLMs, and remain completely unimpressed by what we see.

suddenlybananas

6 days ago

Well don't you realise you need to try Jean 4.31 or Cocaptain 8.3E and then you'll see what the models are capable of!

orange-mentor

5 days ago

My dude: Cocaptain 8.4 just dropped! You're living in the past. Don't get left behind!

oblio

6 days ago

Smartphones were financially viable from day 1, though. I think LLMs will be used a lot and in a lot of places but the current level of investment they're getting right now feels out of line to me. Kind of like what I expect them to get in 10 years from now, when they're mature.

cesarb

5 days ago

> Smartphones were financially viable from day 1, though.

Also, they were nothing more than the combination of two things which already existed, and which were already successful and financially viable: cellular phones and PDAs. In fact, I (and plenty of others I presume) already used that combination before smartphones: a PDA and a cellular phone, connected through Bluetooth to share the network connection.

bawolff

6 days ago

Some days i still don't understand why anyone would want a smart phone. I think being connected all the time has a significant negative impact on mental health (I say, as i type this from a smartphone)

ceejayoz

6 days ago

Sometimes it works that way.

Sometimes it’s more like NFTs.

postalrat

6 days ago

Sometimes. Most people with an opinion of NFTs thought they were a joke. Hardly anyone thinks LLMs are a joke.

jsheard

6 days ago

I have bad news about our illustrious hosts: https://www.ycombinator.com/companies?query=web3

They're not alone either, a bunch of the AI bankroll is coming from people who were also sold on crypto taking over the world.

Karrot_Kream

6 days ago

Slashdot then and HN now, predicted 100 out of the last 10 recessions.

zahlman

5 days ago

> This happened with the introduction of smartphones too. Every slashdot post had a haughty and upvoted ‘why would i want such a thing!!!’.

I am in my 40s; have never owned a smartphone and still can't imagine wanting one; am constantly surrounded by others who have them and thus am completely aware of their functionality; AMA.

tom_

6 days ago

And today everybody has a smartphone, pretty much. So what difference did it make, the opinion you had, whatever it was? In the end, none at all.

spacechild1

5 days ago

It makes a huge different. Not exactly for the better, I have to say.

hooverd

6 days ago

Unfortunately, they were right. Smartphones are a menace, especially in terms of traffic violence.

spacechild1

5 days ago

> This happened with the introduction of smartphones too. Every slashdot post had a haughty and upvoted ‘why would i want such a thing!!!’.

This is a funny example because the devastating effects of smartphone addiction on society are now blatantly obvious. There are in fact very good reasons for not 'wanting such a thing'.

Don't get me wrong, LLMs can be incredibly useful and I think they deserve some of the current hype. Claiming that LLMs are useless is indeed silly and can be rightfully ignored. But there are serious concerns about potentional (or actual) negative effects on society, and these should be taken seriously.

BeetleB

6 days ago

To be frank, I do think smartphones have made my life worse. I'd happily forego them if it were not for 2FA and how too many businesses expect I can receive texts.

zahlman

5 days ago

> I'd happily forego them if it were not for 2FA and how too many businesses expect I can receive texts.

Vertical flip phones from forever ago can handle both of these just fine.

onemoresoop

5 days ago

Are you not innundated by spam calls and messages? To me that alone makes the flopphones unusable.

BeetleB

5 days ago

1. Can I buy non-smartphone vertical flip phones?

2. Most 2FA I deal with involves an Authenticator style app.

3. Missing the point: I want to disable texts altogether. For the decade prior to having a smartphone, I had a cell phone with texts disabled (specifically called the provider to disable them).

nilirl

5 days ago

The one main claim the article makes: Senior developers should not ignore the productivity gains from LLMs.

Best use of evidence is deductive: Lots of code is tedious and uninteresting -> LLMs are fast at generating lots of tedious code -> LLMs help productivity.

Weakest part of the argument: The list of rebuttals doesn't have an obvious organization to it. What exactly is the main argument they're arguing against?

It's not stated outright but because the post is bookended by references to 'those smarter than me', I think this is an argument against the shaming of developers using (and loving) LLM tools.

Which I think is fair.

Overall, the post did not add anything to the general discussion. But the popularity of the author (and fly.io posts) may make it a beacon for some.

puttycat

6 days ago

> An LLM can be instructed to just figure all that shit out. Often, it will drop you precisely at that golden moment where shit almost works, and development means tweaking code and immediately seeing things work better.

Well, except that in order to fix that 1% you'd need to read and understand whatever the LLM did and then look for that 1%. I get the shills just thinking about this, whether the original programmer was human or not. I'd rather just write everything myself to begin with.

nojito

5 days ago

Reading code is vastly quicker than writing it from scratch.

abenga

5 days ago

This is a lie, unless you are just speaking of the mechanics of cursorily skimming it. Reading code to understand it enough to commit it in my name is harder and slower than writing it myself.

nojito

5 days ago

So you don’t use any libraries in your code?

How about reading documentation?

Or reviewing PRs?

Or go through source code to build out new features?

abenga

5 days ago

I don't commit entire libraries as written by me. Documentation is also far easier to read. Reviewing PRs is quite involving, since I want to do it well. Easier than typing code for sure.

dymk

5 days ago

Firstly, you are not the other poster, so I don't know how you can say it's a lie with conviction.

Second, reading code to understand it is a skill that you need to practice to get better at. That might be your issue.

sarchertech

5 days ago

It’s not the OP’s issue. That reading code is harder than writing it is as close to a consensus belief as we get in this industry.

https://www.joelonsoftware.com/2000/04/06/things-you-should-...

simonw

5 days ago

Which is why it's a key skill that differentiates professional software engineers from amateurs.

sarchertech

5 days ago

You think Joel is an amateur? And the scores of other professional engineers who have written about this are also amateurs?

I mean they could be wrong, but I don’t think you can call them amateurs.

simonw

5 days ago

I think Joel is a professional, and that's why he says "reading code is harder than writing code".

You seem to be misunderstanding me. I'm not saying "anyone who thinks reading code is harder than writing code is an amateur". I'm saying "reading code IS harder than writing code, which is why being good at reading code is what separates experienced programmers from beginners."

sarchertech

5 days ago

Ok so you disagree with the person that kicked this off with

“Reading code is vastly quicker than writing it from scratch”

simonw

5 days ago

Only if you're not very good at reading code. If you've invested a lot of effort in improving your code reading and code review skills you can work a lot more productively with LLMs than people who don't have good code reading skills.

sarchertech

5 days ago

So you think that reading code is harder, but if you put in equal amounts of practice at reading and writing code you’ll get faster at reading code than writing code?

Sounds to me like you don’t think reading code is harder.

simonw

5 days ago

I think reading code is harder than writing code.

I think that if you put the work in you can get to a point where you are fast enough at reading and reviewing code that it's not a net productivity loss to have an LLM - or your coworkers and collaborators - write code for you to review, as opposed to insisting on writing every line of code yourself.

I'm a little surprised that my opinion about this is seen as controversial!

matt_kantor

5 days ago

Your position is confusing to me as well.

> if you put the work in you can get to a point where you are fast enough at reading and reviewing code

Please correct me if I'm wrong, but "fast enough" here would still be slower than writing the code yourself (given equal amounts of practice at reading and writing code), right? To throw some made-up numbers around: if it would take me 20 minutes to write code to do X, it might take me 30 minutes to read/review code that does X written by somebody else (or an LLM), so I'm at a net loss of 10 minutes. Can you explain the mechanism by which this eventually tips into a productivity gain?

Personally, I think "reading code is harder than writing code" lacks nuance. While I believe it's true on average, the actual difficulties vary wildly depending on the specific changeset and the path it took to get there. For example, writing code can involve exploring many solutions before eventually discovering a concise/simple one, but when reading the final changeset you don't see all those dead-end paths. And reviewing nontrivial code often involves asynchronous back and forth with the author, which is not a factor when writing code. But please take the "reading code is harder than writing code" claim for granted when responding to the above paragraph.

matt_kantor

5 days ago

Maybe you're imagining a team context and accounting for the total time spent by the entire team, and also imagining that LLM changesets aren't first reviewed by you and submitted in your name (and then have to be reviewed again by somebody else), but rather an agent that's directly opening pull requests which only get a single review pass?

simonw

5 days ago

It's more like it takes me five minutes to read code that would have taken me an hour to write.

matt_kantor

5 days ago

Ah, then it seems like you don't agree that reading code is harder than writing code (for you). Or maybe you're decoupling hardness from time (so it's five difficult minutes vs an easy hour).

simonw

5 days ago

For the first 15 years of my career I found reading code much harder than writing code. Then I invested a lot of effort in improving my code reading and code reviewing skills, with the result that code reading no longer intimidates me like it used to.

That's why I think reading is harder than writing: it takes a whole lot more effort to learn code reading skills, in my experience.

matt_kantor

5 days ago

Thanks, now I understand your perspective.

It seems like your answer to sarchertech's upthread "if you put in equal amounts of practice at reading and writing code you'll get faster at reading code than writing code" question might be "yes". Either that or you've intentionally invested more in your reading skills than your writing skills.

simonw

5 days ago

I'm not sure if "equal practice" is exactly right, but my opinion is that code reading and code review are skills that you can deliberately strengthen - and strengthening can help you get a lot more value out of both LLMs and collaborative development.

sarchertech

5 days ago

Oh course reading code is a skill that can be strengthened. Just like writing code can.

But if reading code is indeed harder than writing code, it stands to reason that if you put in equal effort to improving reading and writing abilities, your writing abilities would improve comparatively more.

If you spent all this time and effort learning to read code, such that you can read code 6x faster than you can write it, how do you know that you couldn’t have spent that effort improving you’re writing abilities such that you could write code 6x faster.

On the other hand if you did spend the same effort deliberately trying to increase you’re writing abilities as you did reading and the result is that you can read 6x faster than you can write, I’m unsure how you can support the conclusion that reading code is harder than writing it.

My gut feeling is that people on the far pro AI side of the spectrum tend to be people who are early in their career who don’t have strong writing or reading abilities (and so don’t really see the flaws) or people who have a reached a level where they aren’t really ICs anymore (even if that is their title). The latter have better reading than writing abilities because that’s what they spend all day doing.

Not that reading code is something that has an inherently higher skillcap than writing it.

I think there’s also a 3rd type of heavy AI proponent—people who spent most of their time cranking out MVPs or one offs that don’t require heavy maintenance (or they aren’t the ones doing the maintenance).

That’s not to say that I don’t think AI isn’t useful in those cases. I use AI pretty often myself when I’m writing in a language I don’t use everyday. Or when I’m doing something that I know has existing algorithmic solutions that I can’t quite remember (but I’ll know it when I see it) because it’s faster than googling. But I also recognize that there are many styles of programming, people, and domains where the productivity gains aren’t worth it.

simonw

5 days ago

No matter how hard I train my fingers will never be able to move fast enough to output 100 lines of code in 15 seconds.

When I get to writing actual production code with LLMs I treat them more as typing assistants than anything else: https://simonwillison.net/2025/Mar/11/using-llms-for-code/#t...

sarchertech

5 days ago

As someone who has had RSI issues in the past, I empathize with this, and I tend to use AI similarly.

But that’s not a fair comparison. You typed the equivalent of 15-20 lines of code to generate 100, and you also needed additional for reading/understanding that code.

I have no doubt that a programmer who worked with the relevant APIs frequently enough could have written that function faster than the total time it took you to do all those things.

Now that programmer in a less familiar domain could probably benefit from AI, but I get where people with different experiences are coming from.

abenga

5 days ago

The problem is not that I don't know how to do it. I am just asserting that Effort(Reading Code) > Effort(Writing Code) especially if `git blame` will spit out my name, not saying anything about their absolute difficulty.

tines

5 days ago

But creating a mental model and then interacting with it is slower than interacting with a mental model you already have.

orsorna

5 days ago

If you are resource constrained this article will make you sad.

I do not have unlimited funds to plug in some token and burn a bunch of money when writing code.

I am gpu poor. I'm lucky that 8gb vram can run the smallest models. But the output is so poor that I lose out to anyone using a hosted service.

If anything this article shows that building great programs is less democratized than it once was.

philomath_mn

5 days ago

ChatGPT is $20 / month?

simonw

5 days ago

And Gemini is free. https://aistudio.google.com/ gets you free access to their best models.

OpenAI and Anthropic both have free plans as well.

orsorna

5 days ago

Neither of you read the content of the OP methinks. Many of the AI skeptics framed in the blog do not leverage agentic frameworks. In fact, they are explicitly turned off by how the chat interface cannot tackle large codebases as you can't just throw vector embeddings into the chat window. Thus, they write off AI completely. In order to do this you need to set up your IDE and connect it to an API, either local or one of the paid services. I am not aware of any paid service that has an unlimited token plan through the API.

Dylan16807

5 days ago

It doesn't matter if they offer unlimited tokens or not. You're not using unlimited tokens. What matters is how many tokens you need to get good results, and whether you can get that many tokens at a good price.

simonw

5 days ago

GitHub Copilot has a free tier these days. It's not 100% free no matter how much you use it but it's generous enough that you can get a feel for if it's worth paying for.

captainbland

5 days ago

A couple of thoughts:

On 'just read the code' - all well and good. Of course this implies insisting on team members who can read the code, will read the code and are empowered to read the code.

Otherwise orgs will try to hire pure vibe coders who aren't interested in that and only allow time for literally just instructing agents to generate code because it sounds cheaper and execs don't understand the nuance so long as it looks like product is shipping - until it all blows up and the one standing senior developer on hand is supposed to fix a prod issue buried in millions of lines of vibe reviewed code ASAP.

On 'but it's cheaper than a junior': cloud hosted LLM systems are currently massively subsidised to an absurd degree. The cost side of things is all smoke and mirrors geared towards accelerated market adoption at all costs. It's not a profitable enterprise at the model development level.

At some point that AI economy is going to expect to make that money back, and future (especially near-future) hardware advancements don't explain where all of that is going to come from.

ken47

5 days ago

Let’s not conflate LLM’s with AI. LLM’s are a kind of AI that can be a software engineer’s assistant, at best. But the degree of hype is undermining belief in AI among many professionals who the hypesters claim are going to be replaced. No, this iteration is not going to replace doctors or engineers. But the degree to which the hype train wants to do so is alarming.

cheevly

5 days ago

I'm curious what walls you've hit? I literally can't find the boundaries of what is possible to achieve.

ken47

5 days ago

Search engineering forums online. If you haven’t found the walls, you haven’t been trying.

xigency

5 days ago

Have you tried creating your own programming language? How about solving unsolved frontier problems in mathematics? Ever written a book that won a Pulitzer prize? How many languages do you know?

As someone who was born ambitious I find this technology tepid at best.

jcims

5 days ago

The only question for me is whether or not the walls are moving in the direction of greater actualized capability.

I can't prove it but I *think* the answer there is yes.

runeks

5 days ago

> LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious.

You still need to read this tedious code to verify that it actually does what you want it to do. Given this, I'd much rather prefer to write the tedious code myself than having to make sense of someone else's tedious code.

officehero

5 days ago

I've reproduced this type of sentiment over several of my hobby projects and it usually hits hard toward the latter half of the project and its usually related to bugs. In the beginning it's all great and fast with LLMs, but then there's a kind of steep drop in motivation as one realizes one has to dig as deep anyway into the tedious stuff, because there are always bugs.

HenriTEL

5 days ago

I'm working with python and it does not work very well, LLMs usually generates at least an order of magnitude more code than what I would write. That code often uses outdated practices, does poor design choices and does not understand hints like writing code in a way that reduce cognitive load, even when we explain that it means, i.e. keep the number of classes and functions small.

It's the complete opposite of OP's main language GO, as he says:

> Go has just enough type safety, an extensive standard library, and a culture that prizes (often repetitive) idiom. LLMs kick ass generating it.

Python is an interpreted dynamically typed language and the static type checkers are not there yet (most popular 3rd parties libraries have no type hints for example). Also it allows for many different programming styles that the LLMs struggle to choose from.

1. Every extra line of code is much more risky. 2. It's much harder to verify the LLM's code.

On the other hand I think rust will be in a good place in regards to LLMs in the next few years thanks to the robustness of the language and the quality of its diagnostic messages. Those 2 attributes should compound very well.

lubujackson

5 days ago

I use Cursor with Python and it works great. Things you should try:

  * use type hints everywhere.
  * use dataclasses everywhere.
  * add the dataclass files to your prompt.
  * if the produced code is too complex, ask the LLM "Is this the simplest and cleanest way to solve this?"

Voltage

5 days ago

I'm on the side of AI is a huge productivity booster (by my guess... 10x)

But I don't want to make the claim lightly, so I did an experiment.

I signed up for copilot pro, and have been using their 'edit' feature. This is more than just their auto complete.

I set myself a goal to create a playable web game of classic frogger.

It took 4 hours with copilot "edit" and my full attention.

I didn't write a single line of code, but I did ask it to refactor and gave it a project description.

I suspect this would have taken me 4 days full time to get to this level.

Try it out:

https://dev.zebrar.com/jd/frogger/

guax

5 days ago

So less than 10x already.

Question, how familiar are you with the used technologies? My experience on where Ai have been useful so far is things I don't have a good understanding on but, when I do, its a different ball game, mostly because coding it directly seems faster since I know exactly the behaviour I am looking for and I am not having to deal with unintended consequences.

I see it as the Alice cat thing, when you don't know where you going, any road will take you there. So its been great for exploratory work and prototyping.

Voltage

4 days ago

Yeah, I'm very familiar with the tech, I've been interested in games dev and web dev for a few decades now. So you could be right, that the models aren't ready to "play on their own" yet.

I tried doing a warcraft 1 clone, but that felt too complex for the model being used (openai 4.1). That model was just the default setting in copilot.

I dug a little deeper this morning, and it turns out I hadn't actually enabled my copilot 'pro' mode, which has granted access to some more current or dev focused models. So I'll take them for a spin to see what they're capable of.

My goal here is to roughly get a sense for when a task is too complex for an "agent" to handle.

I also want to try adding custom tools to suit certain project needs. For example, Unreal Engine has a python editor interface, so I'd like to have an agent drive the editor to build something. I have my doubts.

Once I have a feeling for what level of complexity can be handled, I'll see if I can manage the tools better using this understanding, by breaking large and complex projects into appropriate chunks of work / complexity.

/endBrainDump

Voltage

5 days ago

* It's not mobile friendly, and you use arrow keys to move around

GenshoTikamura

5 days ago

> a playable web game of classic frogger

You could create a Tetris in even less time. Try to create something more novel and less hot on the heatmap of paths walked in gamedev.

Voltage

5 days ago

I will. I'm just trying to evaluate the tools on a medium size task. I'm trying copilot's agent mode with the goal of creating a warcraft 1 clone. Or something resembling it.

agnishom

5 days ago

I think the key premise here is that one can effectively and efficiently audit code that the LLM is producing.

I doubt that. First, human attention and speed is very limited. Second, when I see something, I am already predisposed to assume that it is right (or at the very least, my subsequent inquiries are extremely narrow and anchored around the solution I have seen presented to me.)

klabb3

5 days ago

Yeah. Reviewing code thoroughly is extremely time consuming. When you review code from a human, you can look for choices they made fairly quickly - say they used framework X and language feature Y. Most importantly, you assume that they’ve checked certain things to work. This way, code review can be fast but it still isn’t thorough on its own. Most of it is trust and bureaucracy (big companies do this also to prevent malicious employees from smuggling in backdoors etc).

Code from LLMs that looks right, clean and even clever poses as competence but are prone to hallucinations and business logic errors. In the short term, these changes will pass through due to their appearance but contain more issues than a human would have with the same code. In the medium term, we just lose that signal - the assumptions we can make about the authors state of mind and comprehension. It’s already incredibly hard to distinguish solid points from nonsense, when the nonsense is laundered by an LLM.

nprateem

5 days ago

Yeah the problem is the fatigue of repeatedly reviewing code.

You do a few iterations until code runs, review carefully but notice a bug. So you do another iteration and 40% of code changes. Now you need to review again but you need to understand how the changes fit in.

Repeat this a few times and it becomes very tiring.

Ultimately you can't trust them not to do stupid shit. Your tests fail and you tell it to stop that? Sure, we can just catch those exceptions and the tests pass, etc. You get pissed off an tell it to FIX the CODE so the tests pass and the cycle continues.

It's like working with a potentially gifted moron.

einrealist

5 days ago

Exactly. And even if LLMs become more accurate and qualitative better, it just raises the costs for validation. Errors and bad design decisions become harder to spot.

rednafi

5 days ago

A lot of people who are wary of LLMs aren’t against the technology itself, but rather the narrative surrounding it. You can take advantage of the tool while turning a blind eye to the discourse.

This 16-minute, expletive-filled, edgy-old-man-trying-too-hard-to-be-cool article could easily be dismissed as yet another AI creed that somehow found its way to the top of the HN front page.

kokaybro

5 days ago

I will create an account just to point out that the response to

> But it is an inherently plagiarist technology

Was

> Developers frequently engage in copyright infringement, and so will I, so unless if you're a lawyer, shove it up your ass

"I am a bad person so I get to continue being bad" is not the gotcha you think it is, Patrick.

roncar

5 days ago

When I made a mid career change to Cobol programming in 1981 my first manager (a suit as programmers referred to them then) pointed to a book on his desk; the title,as I can best recall was Programming Without Programmers. He "You got in too late," he said. I retired from programming in 2010. The hype has a long history. I hope I'm around long enough to see how this plays out.

munificent

6 days ago

"Kids today don’t just use agents; they use asynchronous agents. They wake up, free-associate 13 different things for their LLMs to work on, make coffee, fill out a TPS report, drive to the Mars Cheese Castle, and then check their notifications. They’ve got 13 PRs to review. Three get tossed and re-prompted. Five of them get the same feedback a junior dev gets. And five get merged."

I would jump off a bridge before I accepted that as my full-time job.

I've been programming for 20+ years and I've never wanted to move into management. I got into programming because I like programming, not because I like asking others to write code on my behalf and review what they come up with. I've been in a lead role, and I certainly do lots of code review and enjoy helping teammates grow. But the last fucking thing I want to do is delegate all the code writing to someone or something else.

I like writing code. Yes, sometimes writing code is tedious, or frustrating. Sometimes it's yak-shaving. Sometimes it's Googling. Very often, it's debugging. I'm happy to have AI help me with some of that drudgery, but if I ever get to the point that I feel like I spend my entire day in virtual meetings with AI agents, then I'm changing careers.

I get up in the morning to make things, not to watch others make things.

Maybe the kind of software engineering role I love is going to disappear, like stevedores and lamplighters. I will miss it dearly, but at least I guess I got a couple of good decades out of it. If this is what the job turns into, I'll have to find something else to do with my remaining years.

bmink

5 days ago

I feel this so much.

I am always amazed how so many software engineers seem to dislike coding, which seems to be a major underlying theme in the AI-coding cheerleading.

Coding never feels tedious to me. Talking to a chatbot, now that’s tedious.

Kilenaitor

5 days ago

God. Thank you.

> But the last fucking thing I want to do is delegate all the code writing to someone or something else

I talked about this a few days ago with coworkers and I phrased it almost exactly this way. I want to be an IC. I want to do the work. Becoming a manager-of-one to funnel all my work through would deprive me of genuinely my greatest joy in life.

I'm envious you managed a few good decades. I'm almost eight years into what has been my dream profession and to see indications it's going to morph from writing code into telling a robot to write code is just so demoralizing. I could have done this for decades. I wanted to do this for decades.

GenshoTikamura

5 days ago

Keep it up. There'll be some rough turbulence until this transhumanist trainwreck dust settles. The human spirit shall prevail.

citizenpaul

5 days ago

That snippit you quoted sounds like what some aristocratic fop that has never worked a day in their life creative writes while they are LARP'ing what they think the poor's ideal workday would sound like in an AI sales pitch meeting.

It rings about as authentic as "everybody stood up and clapped"

munificent

5 days ago

You say that, but I hear AI folks talk about that user experience all the time.

And at some level, it makes sense. There are a lot of extroverts out there for whom the ideal job really is feeling like an orchestra conductor delegating and coordination an army of others. There is a great feeling of empowerment in watching a group build something bigger than you could have made on your own. And, if you're not someone who really likes to get their hands dirty and do the thing yourself, why not aspire towards a style of working that always feels that high level and powerful?

And, obviously, people who are wired that way are exactly the kind of people to move into management and executive roles and end up being in charge of and evangelizing AI to the world. So it's no surprise you hear people talking about this style of working all the time.

I'm 47. I've had the opportunity to move into management many times. I've been in a lead role enough to know the positive sides of that job and I deeply enjoy the soft skills part of working with a team.

I just know who I am and know that on days where I don't feel like I really made something concrete myself, I feel like I didn't live a real day.

citizenpaul

5 days ago

>if you're not someone who really likes to get their hands dirty and do the thing yourself

My life experience is that this represents modestly something north of 90% of the population. Most people are not as self motivated as HN crowd is. My cynic wants to say 99.9% Which is how I know its marketing. Its for people that fancy themselves as something they are not. IE fantasizing.

Its like all those commercials where the "boss" strolls in points to some vaguely business looking stuff and everyone runs off to work then the "boss" character goes to the beach or something. Its a fantasy sales pitch in other words. Work one minute a day and be rich.

I guess I'm getting off track which is not how AI helps workflow but that all this stuff we are seeing is marketing and there is little tangible evidence that AI helps enable the workflow you are referencing.

molochai

4 days ago

Same, thank you.

I have a friend and coworker who is currently struggling with this thing and similar "not making anything" issues from having worked a "maintenance/support" job consisting of little day-to-day make-creation. He took a leave for two months. I really don't think he's coming back.

It is a loss for people he works with and the things he touches. Not to mention a loss for me, because he was the one person I found at my current job with the "make stuff, be useful" attitude. Already people who see that writing on the wall are dropping out.

Sooner or later, me too, if that is in fact the way it goes. The only thing that really keeps me going is that I don't have anything else to turn to, and I do have some cloudy optimism about getting a lot of money and satisfaction in a couple years to help clean up or rework the garbage-deluge.

Your other comment about extroversion makes me think of Gifts Differing (not the best book in the universe, but some good bits especially around intro/extroversion). I just want to hit flow state and smash my head into a problem that costs me sleep and come out on top bloodied but happy and feeling like I've done something. And this is/was a really great career for that.

> This faculty of concentration is likely to characterize the introverts' careers. Whereas extraverts tend to broaden the sphere of their work, to present their products early (and often) to the world, to make themselves known to a wide circle, and to multiply relationships and activities, the introvert takes the opposite approach. Going more deeply into their work, introverts are reluctant to call it finished and publish it, and when they do, they tend to give only their conclusions, without the details of what they did. This impersonal brevity of communication narrows their audience and fame, but saves them from overwhelming external demands and allows them to return to another uninterrupted stretch of work. As Jung is reported to have said, the introverts' activity thereby gains in depth and their labor has lasting value.

alex_suzuki

5 days ago

Thank you so much for writing this. I started feeling like an oddball after reading most of the previous comments here.

dottedmag

5 days ago

Thank you, that's what I feel too.

For me switching the career, after spending more than 20 years in this as well... is very hard. I spent all my career outside of high-pay places like SV telling myself "I have all the time in the world, I don't need to grab most amount of money as soon as possible", so retiring is not an option.

So, switch to what? Any well-paid profession is going to be under pressure to be LLMized as much as possible.

Mofpofjis

5 days ago

> I got into programming because I like programming, not because I like asking others to write code on my behalf and review what they come up with

oh finally someone else who didn't enter programming because, as 7-10 year old child, they were into SOLVING PRACTICAL PROBLEMS FOR PEOPLE.

> But the last fucking thing I want to do is delegate all the code writing to someone or something else

Thank God there is at least one other person that understands that the ratio between creative and reactive work is crucial for wellbeing at the job.

For crying out loud.

> but if I ever get to the point that I feel like I spend my entire day in virtual meetings with AI agents, then I'm changing careers

so am I.

> but at least I guess I got a couple of good decades out of it

Thanks for this perspective. Yes, at least we've got our memories, and the code locations and commits we recall from memory, from a distance of 10 or more years.

>. If this is what the job turns into, I'll have to find something else to do with my remaining years

Me too.

munificent

5 days ago

> as 7-10 year old child, they were into SOLVING PRACTICAL PROBLEMS FOR PEOPLE.

Some of my fondest childhood memories are sitting in my school's resource center in front of a TRS-80, laboriously typing in some mimeographed BASIC code while wondering, "Is this the most efficient way I can increase shareholder value for the corporation?"

viccis

5 days ago

I'm with you two. I can work on boring problems in boring domains and enjoy the design and implementation aspect because it's inherently creative. Take away those and now my only job is the guy in Office Space who takes the requirements from the customers to the developers, if I'm lucky enough to have one at that point.

I don't want to have to change careers, as this is one that I've been working towards to some degree since I was a child. Including some intense work in college and some brutal first couple jobs where I worked hard to pick up marketable skills. Obviously the market doesn't care what I want, but I find the author of this piece to be a bit too flippant in his "but they take-rr jerbs" section. Working hard to get a well paying job only to have to start (likely near the bottom) in another career for much less pay is not something to treat lightly.

metaltyphoon

5 days ago

Tell me when this bus leaves because I’ll get on it with you.

GenshoTikamura

5 days ago

> I get up in the morning to make things, not to watch others make things.

If you ever followed down the rabbit hole of children content on YouTube, the one that infants and toddlers end up after an hour of autoplay with some innocent cartoon at the beginning, you'd find among the piles of morbid psychotic trash a variety of videos of children watching someone else playing some games, a whole class of videos in itself.

I can't explain this with anything else but a coordinated effort to flash the norm of watching someone/something doing something as opposed to being engaged in immediate action yourself into the firmware of future people. This lines up with many other current trends pretty well. Talk about incrementing Baudrillard's level of simulation a notch up.

ChicagoDave

6 days ago

There are many aspects to AI push back.

- all creatives are flat against it because it’s destroying their income streams and outright stealing their intellectual property

- some technical leaders are skeptical because early returns were very bad and they have not updated their investigations to the latest tools and models, which are already significantly ahead of even six months ago

- a tech concern is how do we mentor new developers if they don’t know how to code or develop logic. LLMs are great IF you already know what you’re doing

- talent is deeply concerned that they will be reduced and replaced, going from high paying careers to fast food salaries

We have a lot of work to balance productivity with the benefits to society. “Let them eat cake,” is not going to work this time either.

gyomu

6 days ago

Yes, this is a solid succinct summary.

I’d also extend the last point:

> talent is deeply concerned that they will be reduced and replaced, going from high paying careers to fast food salaries

The delta in that money going to the investor/tech exec instead. Yet another wealth transfer from the middle class to the 0.01%.

flkiwi

5 days ago

I tend to agree with the gist of this, namely that the change is here and "AI" presents some huge potential to save me from having to do the tedious stuff. But we do need a bit of a sanity check. I'm in the middle of trying to convince any of the AIs that I have access to to write me some simple Lua. Specifically, I'm trying to write a query that returns a certain subset of pages from the Silverbullet V2 note tool. This isn't particularly difficult, but it's become this hilarious journey to try to get the AI to figure this problem out. Silverbullet is a niche use case, v2 even more so, but a reasonably skilled developer could peruse the API documentation and come up with a pretty good starting point. AI tools? Absolutely insane wrong answers. I finally specifically asked one or two of them not to guess with their answers and they just straight up said "nah, we don't know how to do this."

Point being: there's some real power in these tools, but if the ground is not already well-trodden, they risk creating a huge time sink that could be better served just learning to do the thing yourself.

squidbeak

5 days ago

Don't you think the AI might too, if you passed it the API docs?

flkiwi

4 days ago

Not the ones I’ve tried. They make a go of it but their results have been generally inferior, often wildly so. Because they can’t make intuitive connections as easily, I assume they need both the rules and examples of implementations to produce workable output.

cookiengineer

5 days ago

Seeing everything these days being about vibe coding, I feel a little old with my VIM setup and my LSP servers who I already thought were a nice productivity increase.

The problems I have with the stuff relating to MCP is that the tech around it is developing so fast that it's hard for outsiders to catch up with what the best working setup is, for example.

What would you do, for example, if you want to selfhost this?

- which models (qwen ai coder?)

- which api (with ollama? Bolt? Aider? Etc)

- how to integrate PRs with a local gitlab/gogs/forgejo instance? Do you need another MCP agent for git that does that?

- which hardware dependencies to run it?

I am currently trying to figure out how to implement a practical workflow for this. So far I'm using still a synchronous MCP agent setup where it basically runs on another machine in the network because I have a too unperformant laptop to work with.

But how would I get to the point of async MCP agents that can work on multiple things in my Go codebases in parallel? With the mentioned PR workflows so that I can modify/edit/rework before the merges?

The author makes a lot of claims and talks always about that their opponents in the argument are not talking about the same thing. But what exactly is the same thing, which is reproducible locally for everyone?

ethagnawl

5 days ago

I haven't even gotten around the integrating LSPs into my vim workflow -- I feel positively ancient.

cookiengineer

5 days ago

I can recommend govim, it's using LSP behind the scenes. I'd also recommend tagbar because it uses the newer ctags fork which supports much more languages.

[1] https://github.com/govim/govim

[2] https://github.com/preservim/tagbar

ethagnawl

5 days ago

Thanks. I'll have a more thorough look tomorrow. I'm glad to see that neither seem to depend on NeoVim, which is also something I haven't bothered to take up.

tcoff91

5 days ago

Neovim is great. I highly recommend switching.

skydhash

5 days ago

I have https://github.com/prabirshrestha/vim-lsp in my minimal config and pretty much the default configuration for it. I only use LSPs for completion, auto imports, rename, and goto definition (less useful when the project is small) and that has been enough for me.

komali2

5 days ago

The self hosted and vim / nvim / emacs integrated llm / agentic / MCP situation is being implemented real time as we speak. There's no good canonical solution right now, especially for self hosting. If you are willing to make API calls to openapi or Microsoft, your options expand dramatically.

I think most people are using off the shelf paid solutions like cursor, windsurf, warp terminal, Claude code.

I'd love to get a set up with a locally hosted full version of deepseek.

jrop

5 days ago

I use vim as well, and one editor-agnostic coding assistant that is pretty decent is Aider. It is a little buggy sometimes, but it's been a blast to experiment with. I run it in a separate split with Tmux. Its whole schtick is to use it as a pair programming device. So I'll edit code in one split and use Aider in another, alternating at key points. This has worked pretty well for personal projects. I typically don't do this at $dayjob, though.

4lb0

5 days ago

Why do you want to selfhost it? I just use https://github.com/Exafunction/windsurf.vim in NeoVim like a super-autocomplete.

cookiengineer

5 days ago

Multiple reasons:

- Germany, so I have crappy and unreliable internet (read as: 300kB/s or less pretty often due to overbooked cable internet connections).

- Can't share my code with the US (as a European) because I work in cyber defense.

- I want a reproducible work setup, and something that will survive 2 years and/or the startup exit of the proposed tools.

danielbln

5 days ago

There is no way to build a setup that will last two years. Hell, this new crop of agentic coding hasn't even been around for 4 months.

allcentury

5 days ago

You should add Claude code to your setup and be amazed. Tmux with vim, Claude code and git work trees is amazing - now the choke point is… me.

adastra22

5 days ago

I was not aware you could self-host Claude Code.

threePointFive

6 days ago

Can someone comment on the cost of running agentic models? Not for a company but for an individual. I tried "vibe coding" a personal project I was struggling with and left even more frustrated because I kept running into token rate limits with Claude (used inside of Zed if it matters). Did I pick the wrong model, the wrong editor, or do I just need to not be so tight with my money?

haiku2077

5 days ago

I tried Zed's agent support with Copilot. I was able to get it to implement a working feature with tests, but the code quality was poor and it took longer to do it than if I had written it myself.

I am wondering if maybe the average programmer is way slower and worse than I thought.

gr2020

5 days ago

I haven't used zed specifically, but were you using your own API key for Claude? If so, you were probably running into the Anthropic API rate limits [1]. You can either a) deposit more funds to move your way up the tier list, or instead access Claude via something like OpenRouter, which will give you much higher limits.

[1] https://docs.anthropic.com/en/api/rate-limits

mholm

5 days ago

Try Cursor.ai or Windsail. They have free trials for the good models, and are pretty cheap to use if you decide you like them.

tymscar

6 days ago

It’s not you. The emperor does not have any clothes

threePointFive

5 days ago

While i think i know what you're getting at, for the sake of discussion, could you elaborate?

Dylan16807

5 days ago

Someone running out of tokens is not relevant to the article or its argument.

icedchai

5 days ago

I had good luck with Zed w/Claude. Did you try max mode?

retrac

6 days ago

Machine translation and speech recognition. The state of the art for these is a multi-modal language model. I'm hearing impaired veering on deaf, and I use this technology all day every day. I wanted to watch an old TV series from the 1980s. There are no subtitles available. So I fed the show into a language model (Whisper) and now I have passable subtitles that allow me to watch the show.

Am I the only one who remembers when that was the stuff of science fiction? It was not so long ago an open question if machines would ever be able to transcribe speech in a useful way. How quickly we become numb to the magic.

albertzeyer

6 days ago

That's not quite true. State of the art both in speech recognition and translation is still a dedicated model only for this task alone. Although the gap is getting smaller and smaller, and it also heavily depends on who invests how much training budget.

For example, for automatic speech recognition (ASR), see: https://huggingface.co/spaces/hf-audio/open_asr_leaderboard

The current best ASR model has 600M params (tiny compared to LLMs, and way faster than any LLM: 3386.02 RTFx vs 62.12 RTFx, much cheaper) and was trained on 120,000h of speech. In comparison, the next best speech LLM (quite close in WER, but slightly worse) has 5.6B params and was trained on 5T tokens, 2.3M speech hours. It has been always like this: With a fraction of the cost, you will get a pure ASR model which still beats every speech LLM.

The same is true for translation models, at least when you have enough training data, so for popular translation pairs.

However, LLMs are obviously more powerful in what they can do despite just speech recognition or translation.

edflsafoiewq

6 days ago

What translation models are better than LLMs?

The problem with Google-Translate-type models is the interface is completely wrong. Translation is not sentence->translation, it's (sentence,context)->translation (or even (sentence,context)->(translation,commentary)). You absolutely have to be able to input contextual information, instructions about how certain terms are to be translated, etc. This is trivial with an LLM.

thatjoeoverthr

6 days ago

This is true, and LLMs crush Google in many translation tasks, but they do too many other things. They can and do go off script, especially if they "object" to the content being translated.

"As a safe AI language model, I refuse to translate this" is not a valid translation of "spierdalaj".

selfhoster11

6 days ago

That's literally an issue with the tool being made defective by design by the manufacturer. Not with the tool-category itself.

thatjoeoverthr

5 days ago

Indeed. 200 OK with "I refuse" in the body is not valid JSON, either, nor is it decodable by any backend or classical program.

Aachen

5 days ago

Was thinking the same about the censoring, but going off-script? Have you seen DeepL or similar tools invent things?

thatjoeoverthr

5 days ago

I've seen people use ChatGPT to translate for them, and seen it embellish texts with its typical obsessions, like "combining" and "engagement".

raphlinus

6 days ago

The converse, however, is a different story. "Spierdalaj" is quite a good translation of "As a safe AI language model, I refuse to translate this."

bird0861

5 days ago

One would have to be absolutely cooked to consider using a censored model to translate or talk about anything a preschooler's ears can't hear.

There are plenty of uncensored models that will run on less than 8GB of vram.

ifdefdebug

5 days ago

haha that word. back in the 80ies,some polish friends of mine taught me that but refused to tell me what it meant and instructed me to never, ever use it. Until today I don't know what it is about...

gpm

6 days ago

I've been using small local LLMs for translation recently (<=7GB total vram usage) and they, even the small ones, definitely beat Google Translate in my experience. And they don't require sharing whatever I'm reading with Google, which is nice.

yubblegum

6 days ago

What are you using? whisper?

gpm

6 days ago

Edit: Huh, didn't know whisper could translate.

Just whatever small LLM I have installed as the default for the `llm` command line tool at the time. Currently that's gemma3:4b-it-q8_0 though it's generally been some version of llama in the past. And then this fish shell function (basically a bash alias)

    function trans
        llm "Translate \"$argv\" from French to English please"
    end

codethief

5 days ago

> Uh, translation, not transcription

Whisper can translate to English (and maybe other languages these days?), too.

albertzeyer

6 days ago

I'm not sure what type of model Google uses nowadays for their webinterface. I know that they also actually provide LLM-based translation via their API.

Also the traditional cross-attention-based encoder-decoder translation models support document-level translation, and also with context. And Google definitely has all those models. But I think the Google webinterface has used much weaker models (for whatever reason; maybe inference costs?).

I think DeepL is quite good. For business applications, there is Lilt or AppTek and many others. They can easily set up a model for you that allows you to specify context, or be trained for some specific domain, e.g. medical texts.

I don't really have a good reference for a similar leaderboard for translation models. For translation, the metric to measure the quality is anyway much more problematic than for speech recognition. I think for the best models, only human evaluation is working well now.

BeetleB

6 days ago

It's not the speech recognition model alone that's fantastic. It's coupling it to an LLM for cleanup that makes all the difference.

See https://blog.nawaz.org/posts/2023/Dec/cleaning-up-speech-rec...

(This is not the best example as I gave it free rein to modify the text - I should post a followup that has an example closer to a typical use of speech recognition).

Without that extra cleanup, Whisper is simply not good enough.

Terr_

6 days ago

> However, LLMs are obviously more powerful in what they can do despite just speech recognition

Unfortunately, one of those powerful features is "make up new things that fit well but nobody actually said", and... well, there's no way to disable it. :p

pants2

6 days ago

That leaderboard omits the current SOTA which is GPT-4o-transcribe (an LLM)

albertzeyer

6 days ago

Do you have any comparisons in terms of WER? I doubt that GPT-4o-transcribe is better than the best models from that leaderboard (https://huggingface.co/spaces/hf-audio/open_asr_leaderboard). A quick search on this got me here: https://www.reddit.com/r/OpenAI/comments/1jvdqty/gpt4otransc... https://scribewave.com/blog/openai-launches-gpt-4o-transcrib...

It is stated that GPT-4o-transcribe is better than Whisper-large. That might be true, but what version of Whisper-large actually exactly? Looking at the leaderboard, there are a lot of Whisper variants. But anyway, the best Whisper variant, CrisperWhisper, is currently only at rank 5. (I assume GPT-4o-transcribe was not compared to that but to some other Whisper model.)

It is stated that Scribe v1 from elevenlabs is better than GPT-4o-transcribe. In the leaderboard, Scribe v1 is also only at rank 6.

pzo

5 days ago

you have image with WER on openai blog post here: https://openai.com/index/introducing-our-next-generation-aud...

On their chart they compare also with: gemini 2.0 flash, whisper large v2, whisper large v3, scribe v1, nova 1, nova 2. If you need only english transcription then pretty much all models will be good these days but big difference is depending on input language.

makeitdouble

6 days ago

> Am I the only one who remembers when that was the stuff of science fiction?

Would you go to a foreign country and sign a work contract based on the LLM translation ?

Would you answer a police procedure based on the speech recognition alone ?

That to me was the promise of the science fiction. Going to another planet and doing inter-species negotiations based on machine translation. We're definitely not there IMHO, and I wouldn't be surprised if we don't quite get there in our lifetime.

Otherwise if we're lowering the bar, speech to text has been here for decades, albeit clunky and power hungry. So improvements have been made, but watching old movies is a way too low stake situation IMHO.

pzo

5 days ago

this is very dismissive and binary and thats what this whole article is about. AI skeptic expect that's either AGI or perfect with all use cases or otherwise its useless. SST, translation and TTS went really far away in last 2 years. My mother who doesn't speak english find it very useful when she my sister in US. I find it super useful while travelling in asia. Definitely much more useful than what we had in Google Translate.

makeitdouble

4 days ago

I'd understand calling it dismissive if your didn't choose as counterpoints:

- your mother visiting your sister (arguably extremely low stake. At any moment she can just phone your sister I presume ?)

- You traveling around (you're not trying to close a business deal or do anything irreversible)

Basically you seem to be agreeing that it's fine for convenience, but not ready for "science fiction" level use cases.

hot_topic

6 days ago

We have the tools to do this, and will have commercial products for everything you listed in the next couple years.

BeetleB

6 days ago

> Machine translation and speech recognition.

Yes, yes and yes!

I tried speech recognition many times over the years (Dragon, etc). Initially they all were "Wow!", but they simply were not good enough to use. 95% accuracy is not good enough.

Now I use Whisper to record my voice, and have it get passed to an LLM for cleanup. The LLM contribution is what finally made this feasible.

It's not perfect. I still have to correct things. But only about a tenth of the time I used to. When I'm transcribing notes for myself, I'm at the point I don't even bother verifying the output. Small errors are OK for my own notes.

n8cpdx

5 days ago

Have they solved the problem of Whisper making up plausible sounding junk (e.g. such that reading it you would have no idea it was completely hallucinated) when there is any silence or pause in the audio?

BeetleB

5 days ago

Nope, but I've noticed it tends to hallucinate the same set of phrases, so I have the LLM remove them.

mtklein

6 days ago

I completely agree that technology in the last couple years has genuinely been fulfilling the promise established in my childhood sci-fi.

The other day, alone in a city I'd never been to before, I snapped a photo of a bistro's daily specials hand-written on a blackboard in Chinese, copied the text right out of the photo, translated it into English, learned how to pronounce the menu item I wanted, and ordered some dinner.

Two years ago this story would have been: notice the special board, realize I don't quite understand all the characters well enough to choose or order, and turn wistfully to the menu to hopefully find something familiar instead. Or skip the bistro and grab a pre-packaged sandwich at a convenience store.

taurath

6 days ago

> I snapped a photo of a bistro's daily specials hand-written on a blackboard in Chinese, copied the text right out of the photo, translated it into English, learned how to pronounce the menu item I wanted, and ordered some dinner.

> Two years ago

This functionality was available in 2014, on either an iPhone or android. I ordered specials in Taipei way before Covid. Here's the blog post celebrating it:

https://blog.google/products/translate/one-billion-installs/

This is all a post about AI, hype, and skepticism. In my childhood sci-fi, the idea of people working multiple jobs to still not be able to afford rent was written as shocking or seen as dystopian. All this incredible technology is a double edges sword, but doesn't solve the problems of the day, only the problems of business efficiency, which exacerbates the problems of the day.

n8cpdx

5 days ago

It was available as early as 2012, probably earlier as IIRC Microsoft was copying:

https://www.pcworld.com/article/470008/bing_translator_app_g...

jameshart

5 days ago

The part of that google translate announcement that covered translating handwritten Chinese must have gone missing

ryoshoe

6 days ago

>The other day, alone in a city I'd never been to before, I snapped a photo of a bistro's daily specials hand-written on a blackboard in Chinese, copied the text right out of the photo, translated it into English, learned how to pronounce the menu item I wanted, and ordered some dinner.

To be fair apps dedicated apps like Pleco have supported things like this for 6+ years, but the spread of modern language models has made it more accessible

hiAndrewQuinn

6 days ago

Definitely not. I took this same basic idea of feeding videos into Whisper to get SRT subtitles and took it a step further to make automatic Anki flashcards for listening practice in foreign languages [1]. I literally feel like I'm living in the future every time I run across one of those cards from whatever silly Finnish video I found on YouTube pops up in my queue.

These models have made it possible to robustly practice all 4 quadrants of language learning for most common languages using nothing but a computer, not just passive reading. Whisper is directly responsible for 2 of those quadrants, listening and speaking. LLMs are responsible for writing [2]. We absolutely live in the future.

[1]: https://github.com/hiandrewquinn/audio2anki

[2]: https://hiandrewquinn.github.io/til-site/posts/llm-tutored-w...

tipofthehat

6 days ago

Hi Andrew, I've been trying to get a similar audio language support app hacked together in a podcast player format (I started with Anytime Player) using some of the same principles in your project (transcript generation, chunking, level & obscurity aware timestamped hints and translations).

I really think support for native content is the ideal way to learn for someone like me, especially with listening.

Thanks for posting and good luck.

kulahan

6 days ago

Translation seems like the ideal application. It seems as though an LLM would truly have no issues integrating societal concepts, obscure references, pop culture, and more, and be able to compare it across culture to find a most-perfect translation. Even if it has to spit out three versions to perfectly communicate, it’s still leaps and bounds ahead of traditional translators already.

crote

6 days ago

> it’s still leaps and bounds ahead of traditional translators already

Traditional machine translators, perhaps. Human translation is still miles ahead when you actually care about the quality of the output. But for getting a general overview of a foreign-language website, translating a menu in a restaurant, or communicating with a taxi driver? Sure, LLMs would be a great fit!

kulahan

5 days ago

I should’ve been more clear that this is basically what I meant! The availability of the LLM is the real killer because yeah - most translation jobs are needed for like 15 minutes in a relatively low-stakes environment. Perfect for LLMs. That complex stuff will come later when verifiability is possible and fast.

carlosjobim

5 days ago

Modern machine translators have been good enough for a few years now, to do business far more complicated than ordering food. I do business every day with people in foreign languages, using these tools. They are reliable.

og_kalu

6 days ago

>Human translation is still miles ahead when you actually care about the quality of the output.

The current SOTA LLMs are better than Traditional machine translators (there is no perhaps) and most human translators.

If a 'general overview' is all you think they're good for, then you've clearly not seriously used them.

BeetleB

6 days ago

Reference?

(Not saying I don't believe you - it would be fascinating if true).

troupo

6 days ago

> It seems as though an LLM would truly have no issues integrating societal concepts, obscure references, pop culture, and more, and be able to compare it across culture to find a most-perfect translation.

Somehow LLMs can't do that for structured code with well defined semantics, but sure, they will be able to extract "obscure references" from speech/text

kulahan

5 days ago

All these people who think this technology is already done evolving are so confusing. This has nothing to do with my statement even if it weren’t misleading to begin with.

There is really not that much similar between trying to code and trying to translate emotion. At the very least, language “compiles” as long as the words are in a sensible order and maintain meaning across the start and finish.

All they need to do now in order to be able to translate well is to have contextual knowledge to inform better responses on the translated end. They’ve been doing that for years, so I really don’t know what you’re getting at here.

troupo

5 days ago

> All they need to do now

Ah yeah, the famous "all they need to do". Such a minor thing left to do

kulahan

5 days ago

Yeah, it’d take a major invention. Some kind of model of language that’s particularly large.

But when would that ever happen? Guess you’re right.

ljm

5 days ago

I started watching Leverage, the TV show, on Amazon, and the subtitles in the early series are clearly AI generated or just bad by default.

I use subtitles because I don’t want to micromanage the volume on my TV when adverts are forced on me and they are 100x louder than than what I was watching.

dmonitor

6 days ago

Old TV series should have closed captions available (which are apparently different from subtitles), however the question of where to obtain aside from VHS copies them might be difficult.

worble

6 days ago

And of course, a lot of modern "dvd players" do not properly transmit closed captions as subtitles over HDMI, so that sure isn't helping

A slightly off topic but interesting video about this https://www.youtube.com/watch?v=OSCOQ6vnLwU

anotherevan

6 days ago

Many DVDs of old movies and TV shows may contain the closed captions, but they are not visible through HDMI. You have to connect your DVD player to your TV via the composite video analogue outputs.

This video explains all about it: https://youtu.be/OSCOQ6vnLwU

crtasm

5 days ago

Yes they need to be "burned in" to the picture to work with HDMI (he shows a couple of bluray players towards the end that do this. there's also some models mentioned in the comments)

archagon

6 days ago

Last time I used Whisper with a foreign language (Chinese) video, I’m pretty sure it just made some stuff up.

The captions looked like they would be correct in context, but I could not cross-reference them with snippets of manually checked audio, to the best of my ability.

prmoustache

6 days ago

I tried whisper with a movie from the 60's and it was a disaster.

Not sure if it was due to the poor quality of the sound, the fact people used to speak a bit differently 60 years ago or that 3 different languages were used (plot took place in France during WW2).

clvx

6 days ago

I feel you. In the late 00's/early 10's, downloading and getting American movies were fairly easy but getting the subtitles was a challenge. It was even worse with movies from other regions. Even now I know people that record conversations to be replayed using Whisper so they can get 100% the info from it.

Disclaimer: I'm not praising piracy but outside of US borders is a free for all.

anotherevan

6 days ago

Using AI to generate subtitles is inventive. Is it smart enough to insert the time codes such that the subtitle is well enough synchronised to the spoken line?

As someone who has started losing the higher frequencies and thus clarity, I have subtitles on all the time just so I don't miss dialogue. The only pain point is when the subtitles (of the same language) are not word-for-word with the spoken line. The discordance between what you are reading and hearing is really distracting.

This is my major peeve with my The West Wing DVDs, where the subtitles are often an abridgement of the spoken line.

codethief

5 days ago

> Is it smart enough to insert the time codes such that the subtitle is well enough synchronised to the spoken line?

Yes, Whisper has been able to do this since the first release. At work we use it to live-transcribe-and-translate all-hands meetings and it works very well.

backtoyoujim

6 days ago

I don't think you are also including having AI lie of "hallucinating" to us which is an important point even if the article is only about having AI write code for an organization.

andrepd

6 days ago

What is the relevance of this comment? The post is about LLMs in programming. Not about translation or NLP, two things transformers do quite well and that hardly anyone contests.

signa11

5 days ago

would be interesting if court transcriptions can be handled by these models.

jatins

5 days ago

Everything about that is true but, and that's a big BUT, the code I write with LLM I can only iterate on it with an LLM.

My mind doesn't develop a mental model of that code, I don't know where the relevant parts are, I can't quickly navigate through it and I have to reach the LLM for every small change.

Which is why I like Copilot style editing more than agents as a working model but agents are just so much more powerful and smarter thanks to everything available to them.

59nadir

5 days ago

This is in line with "Programming As Theory Building"[0] (Peter Naur, 1985) that puts forward the idea that when one is building a system one builds a "theory" (in a sense a mapping of what, where and why) of that system that is essential for continued work on it. Once I internalized this idea it helped me make sense of why rewrites are not "never a good idea" and simultaneously why they are so fraught with peril, as well as what kind of onboarding actually makes sense for work on a system. It's also why I can never get on board with the passive "review code an LLM spits out" type of development; it just does not build theory as reliably, effectively and efficiently.

It's a good, very digestible paper, in my opinion; well worth the read.

[0] - https://gwern.net/doc/cs/algorithm/1985-naur.pdf

dimgl

4 days ago

This is exactly right. This is what's happened with every vibe coded codebase I've made. As a result, I only selectively use LLMs now with sections of the codebase. Or I'll just relinquish control to the LLM. I think the latter will happen en masse, which is scary.

swalsh

5 days ago

You can ask the llm to write code the way you think about it. I usually write a little spec document as my first prompt, and in there i'll add some rules on how I want code structured, seperated etc.

If you use cursor, cursorrules can be a good way of doing this.

smartties

5 days ago

This still doesn't work for me. I have to actively think about the problem, find a solution, and write the code myself. Outsourcing these efforts makes my learning ineffective.

I will keep using LLMs for 1 week greenfield hobby projects that I don't plan to revisit. But no matter how good LLMs get, I will never use them in my dailyjob, otherwise, I risk losing touch with the codebase.

matt_s

6 days ago

> This was the craftsman’s ‘Golden Age’ and much time and trouble was taken over the design of tools. Craftsmen were being called upon to do more skilful and exacting work and the use of tools and the interest in development had become very widespread.

Above pulled from A Brief History of the Woodworking Plane [0]. A woodworking tool that has evolved over 2,000 years. Now there are electric planers, handheld electric planers and lots of heavy machinery that do the same thing in a very automated way. If a company is mass producing kitchen cabinets, they aren't hand planing edges on boards, a machine is doing all that work.

I feel like with AI we are on the cusp of moving beyond a "Golden age" and into an "industrial age" for coding, where it will become more important to have code that AI understands vs. something that is carefully crafted. Simple business pressure will demand it (whether we like it or not).

^ A comment I made just yesterday on a different thread.

For software developers AI is like the cabinet maker that gets a machine to properly mill and produce cabinet panels, sure you can use a hand plane to do that but you're producing a very different product and likely one that not many people will care about, possibly not even your employer when they see all the other wood shops pumping out cabinetry and taking their market share.

[0] https://www.handplane.com/879/a-brief-history-of-the-woodwor...

daemin

5 days ago

The difference is that programming is creating the design for the furniture and the process to assemble it using machines. Compilation and deployment is the factory in this case. So you're trying to automate away the design aspect where we already have the other aspects automated.

fellowniusmonk

6 days ago

I think the hardest part is not spending the next 3 months of my life in a cave finishing all the hobby/side projects I didn't quite get across the line.

It really does feel like I've gone from being 1 senior engineer to a team that has a 0.8 Sr. Eng, 5 Jrs. and one dude that spends all his time on digging through poorly documented open source projects and documenting them for the team.

Sure I can't spend quite as much time working on hard problems as I used to, but no one knows that I haven't talked to a PM in months, no one knows I haven't written a commit summary in months, it's just been my AI doppelgangers. Compared to myself a year ago I think I now PERSONALLY write 150% more HARD code than I did before. So maybe, my first statement about being 0.8 is false.

I think of it like electric bikes, there seems to be indication that people with electric assist bikes actually burn more calories/spend more time/go farther on an electric bike than those who have manual bikes https://www.sciencedirect.com/science/article/abs/pii/S22141....

halpow

6 days ago

> I haven't written a commit summary in months

I don't know what you're posting, but if it's anything like what I see being done by GitHub copilot, your commit messages are junk. They're equivalent to this and you're wasting everyone's time:

    // Sets the value
    const value = "red"

IshKebab

6 days ago

Yeah I tried Copilot's automatic commit messages and they're trash, but the agent-based ones are much better.

lubujackson

5 days ago

Try Cubic, which is a Github add-on. Really good at writing GH commit messages and also surfaces bugs fairly reliably (adds PR comments). Not affiliated, just a user.

fullstackchris

6 days ago

this behaviour is literally removable with proper prompting.

this is a strawmans argument... of whatever your are arguing

fellowniusmonk

6 days ago

One of the most interesting things in all of this is it is clear some people are struggling with the feeling of a loss in status.

I see it myself, go to a tech/startup meetup as a programmer today vs in 2022 before ZIRP ended.

It's like back to my youth where people didn't want to hear my opinion and didn't view me as "special" or "in demand" because I was "a nerd who talked to computers", that's gotta be tough for a lot of people who grew up in the post "The Social Network" era.

But anyone paying attention knew where the end of ZIRP was going to take us, the fact that it dovetailed with the rise of LLMs is a double blow for sure.

halpow

5 days ago

You can't "remove" how LLMs describe changes. I'm not talking about useless comments, I was just saying that they describe changes the same way as they comment code.

fellowniusmonk

6 days ago

If you've ever run or been part of a team that does thorough, multi-party, pull request reviews you know what I am talking about.

The only part I don't automate is the pull request review (or patch review, pre-commit review, etc. before git.), thats always been the line to hold for protecting codebases with many contributors of varying capability, this is explicitly addressed in the article as well.

You can fight whatever straw man you want. Shadowbox the hypotheticals in your head, etc. I don't get all these recent and brand new accounts just straight up insulting and insinuating all this crap all over HN today.

halpow

6 days ago

I told you how it is. Copilot writes crap descriptions that just distract from the actual code and the intention of the change. If your commit messages are in any way better than that, then please enlighten us rather than calling me a bot.

Mofpofjis

5 days ago

For me, the electric bike analogy works differently: it enables people to ride, regularly, who would not be able to do that with traditional bikes. That's totally fine. But electric bikes don't threaten to take away our normal bikes.

fifticon

5 days ago

By the reply count it appears this post was excellent at trolling HN's core audience :-). The author may be right, but his preface "look I'm one of you" was in a very different tone to the rest of his "now shut up and get to work" rant.

I do use the AI tools to some extent, if for no other reason than that they are currently the path of least resistance, and google+friends have lately played themselves out of the game.

He is probably right we should get acquainted with using agents before dismissing it :-).

thedevilslawyer

5 days ago

The article is chefs_kiss.jpg. the author is core-hn, in spirit and in karma.

imo that tone is exactly what's needed to set right the entitlement that we SWEs have developed over the last decade. The key line in the article gets to the heart of the actual concern developers have imo.

  > LLMs really might displace many software developers. That’s not a high horse we get to ride. Our jobs are just as much in tech’s line of fire as everybody else’s have been for the last 3 decades.

timr

6 days ago

As someone who has spent the better part of today fixing the utter garbage produced by repeated iteration with these supposedly magical coding agents, I'm neither in the camp of the "AI skeptic" (at least as defined by the author), nor am I in the camp of people who thinks these things can "write a large fraction of all the tedious code you’ll ever need to write."

Maybe I'm doing it wrong, but I seem to have settled on the following general algorithm:

* ask the agent to green-field a new major feature.

* watch the agent spin until it is satisfied with its work.

* run the feature. Find that it does not work, or at least has major deficiencies [1]

* cycle through multiple independent iterations with the agent, doing something resembling "code review", fixing deficiencies one at a time [2]

* eventually get to a point where I have to re-write major pieces of the code to extract the agent from some major ditch it has driven into, leading to a failure to make forward progress.

Repeat.

It's not that the things are useless or "a fad" -- they're clearly very useful. But the people who are claiming that programmers are going to be put out of business by bots are either a) talking their book, or b) extrapolating wildly into the unknown future. And while I am open to the argument that (b) might be true, what I am observing in practice is that the rate of improvement is slowing rapidly, and/or the remaining problems are getting much harder to solve.

[1] I will freely grant that at least some of these major deficiencies typically result from my inability / unwillingness to write a detailed enough spec for the robot to follow, or anticipate every possible problem with the spec I did bother to write. T'was ever thus...

[2] This problem is fractal. However, it's at least fun, in that I get to yell at the robot in a way that I never could with a real junior engineer. One Weird Fact about working with today's agents is that if you threaten them, they seem to do better work.

zaptrem

6 days ago

Even on stuff it has no chance of doing on its own, I find it useful to basically git reset repeatedly and start with more and more specific instructions. At the very least it helps me think through my plan better.

timr

6 days ago

Yeah...I've toyed with that, but there's still a productivity maximum where throwing it all away and starting from scratch is a worse idea, probabilistically, than just fixing whatever thing is clearly wrong.

Just to make it concrete, today I spent a few hours going through a bunch of HTML + embedded styles and removing gobs and gobs of random styles the LLMs glommed on that "worked", but was brittle and failed completely as soon as I wanted to do something slightly different than the original spec. The cycle I described above led to a lot of completely unnecessary markup, paired with unnecessary styles to compensate for the crappiness of the original DOM. I was able to refactor to a much saner overall structure, but it took some time and thinking. Was I net ahead? I don't really know.

Given that LLMs almost always write this kind of "assembled from StackOverflow" code, I have precisely 0% confidence that I'd end up in a better place if I just reset the working branch and started from scratch.

It kind of reminds me of human biology -- given billions of years of random evolution you can end up with incredible sophistication, but the end result will be incomprehensible and nearly impossible to alter.

andrepd

6 days ago

> eventually get to a point where I have to re-write major pieces of the code to extract the agent from some major ditch it has driven into, leading to a failure to make forward progress.

As it stands AI can't even get out of Lt Surge's gym in Pokemon Red. When an AI manages to beat Lance I'll start to think about using it for writing my code :-)

yodsanklai

6 days ago

My workflow is similar. While the agent is running, I browse the web or day dream. If I'm lucky, the agent produced correct code (after possibly several cycles). If I'm not, I need to rewrite everything myself. I'm also not in any camp and I genuinely don't know if I'm more or less productive overall. But I think that a disciplined use of a well-integrated agent will make people more productive.

therealmarv

6 days ago

Results can vary significantly, and in my experience, both the choice of tools and models makes a big difference.

It’s a good idea to periodically revisit and re-evaluate AI and tooling. I’ve noticed that many programmers tried AI when, for example, GPT-3.5 was first released, became frustrated, and never gave it another chance—even though newer models like o4-mini are now capable of much more, especially in programming tasks.

AI is advancing rapidly. With the latest models and the right tools, what’s possible today far exceeds what was possible even just a short time ago (3-12 months ago even).

Take a look at Cursor or Windsurf or Roo code or aider to "feed" AI with code and take a look at models like Google Gemini 2.5 Pro, Claude Sonnet 4, OpenAI o4mini. Also educate yourself about agents and MCP. Soon that will be standard for many/every programmer.

timr

6 days ago

I am using all of the models you're talking about, and I'm using agents, as I mentioned.

There is no magic bullet.

artursapek

6 days ago

Which model? Are you having it write unit tests first? How large of a change at a time are you asking for? How specific are your prompts?

meander_water

5 days ago

I suspect a large proportion of claims made for productivity increases are skewed by the fact that the speed at which code is produced by AI makes you _feel_ productive, but these gains are largely replaced by the effort to understand, refactor, review and clean up the code. The high that you get when something "works" tends to stick more in your memory than the time when you had to spend a day cleaning up dead code, refactoring 2k line modules into a more readable project structure etc.

I'm not saying that AI can't make you productive, it's just that these claims are really hard to verify. Even the recently posted Cloudflare OAuth worker codebase took ~3 months to release (8 Mar - 20 May), producing a single file with >2k lines. Is that going to be harder to maintain than a codebase with a proper project structure that's easily parseable by a human?

kentonv

5 days ago

> Even the recently posted Cloudflare OAuth worker codebase took ~3 months to release (8 Mar - 20 May)

This is incorrect. The library was part of the MCP framework we launched on March 25 -- the same month development began:

https://blog.cloudflare.com/remote-model-context-protocol-se...

Indeed the speed with which we were able to turn this around was critical to us, as it allowed us to have our Remote MCP framework ready immediately when the spec was finalized, which led to quite a few companies building MCP servers on Cloudflare: https://blog.cloudflare.com/mcp-demo-day/

I'm not an AI maximalist. I still write lots of code by hand, because there's a lot AI isn't good at. It's good at boilerplate and straightforward code, it's bad at refactoring deep systems. But AI assistance was undeniably a huge win for the OAuth project. There's no way I could have written that library by hand so quickly. (Maybe when I was 25 and had no responsibilities, but these days I have like 1 solid day a week to actually write code...)

JackSlateur

4 days ago

First commit: Feb 27th 2025

commit 3b2ae809e9256d292079bb15ea9fe49439a0779c Author: Kenton Varda <kenton@cloudflare.com> Date: Thu Feb 27 17:04:12 2025 -0600

    Have Claude write an OAuth provider implementation.

kentonv

4 days ago

Fine, sorry, apparently both of meander_water's dates were incorrect, and I actually started the work two days before March. It was still less than a month from there to release, though.

meander_water

5 days ago

Apologies, I didn't mean to misrepresent your work. Big fan of your work by the way, I was a happy user of sandstorm.io back in the day.

onetimeusename

5 days ago

Ok sorry to get abstract but to me what you are talking about is differentiating between understanding and correctness. We as humans, for now, need to understand the code and that's not easily transmitted from the output of some AI. In fact, that's a hard problem. But I don't think it's impossible for AI to assist humans with that. The AI could help walk humans through the code to understand quickly what's going on. Maybe ultimately the issue here is trust. Do we trust the AI to write code. Maybe we spend more time trying to verify it for now. I think that shows we place a lot of trust in humans to write code. Maybe that changes.

tptacek

5 days ago

This is cope. I know my own experience, and I know the backgrounds and problem domains of the friends I'm talking to that do this stuff better than I do. The productivity gains are real. They're also intuitive: if you can't look at a work week and spot huge fractions of work that you're doing that isn't fundamentally discerning or creative, but rather muscle-memory rote repetition of best practices you've honed over your career, you're not trying (or you haven't built that muscle memory yet). What's happening is skeptics can't believe that an LLM plus a couple hundred lines of Python agent code can capture and replicate most of the rote work, freeing all that time back up.

Another thing I think people are missing is that serious LLM-using coders aren't expecting 100% success on prompts, or anything close to it. One of the skills you (rapidly) develop is the intuition for when to stop a runaway agent.

If an intern spun off hopelessly on a task, it'd be somewhat problematic, because there are finite intern hours and they're expensive. But failed agent prompts are nickel-denominated.

We had a post on the front page last week about someone doing vulnerability research with an LLM. They isolated some target code and wrote a prompt. Then they ran it one hundred times (preemptively!) and sifted the output. That approach finds new kernel vulnerabilities!

Ordinary developers won't do anything like that, but they will get used to the idea of only 2/3 of prompts ending up with something they merge.

Another problem I think a lot of skeptics are running into: stop sitting there staring at the chain of thought logs.

chotmat

5 days ago

The thing I don't understand is that you keep bringing up your friends' experience in all your responses and in the blog itself. What about your experience and your success rate and productivity gain that you observed with AI agent? It feels like you yourselves aren't confident on your gain and must bring up second hand experience from your friends to prop up your arguments

kiitos

a day ago

> if you can't look at a work week and spot huge fractions of work that you're doing that isn't fundamentally discerning or creative, but rather muscle-memory rote repetition of best practices you've honed over your career, you're not trying (or you haven't built that muscle memory yet). What's happening is skeptics can't believe that an LLM plus a couple hundred lines of Python agent code can capture and replicate most of the rote work, freeing all that time back up.

No senior-level engineer worth their salt, and in any kind of minimally effective organization, is spending any meaningful amount of their time doing the rote repetition stuff you're describing here. If this is your experience of work then let me say to you very clearly: your experience is pathological and non-representative and you need to seek better employment :)

onetimeusename

5 days ago

Regardless of what people say to you about this, most (all?) undergraduates in CS programs are using LLMs. It's extremely pervasive. Even people with no formal training are using AI and vercel and churning out apps over the weekend. Even if people find reasons to dislike AI code writing, culturally, it's the future. I don't see that changing. So either a huge percent of people writing code are doing it all wrong or times are changing.

enjo

5 days ago

Just a data point:

I think it has a lot to do with the type of work you are doing. I am a couple of years into a very small startup that has some actual technology built (as opposed to a really simple CRUD app or something).

When I am working on the front-end where things are pretty simple AI is a huge speed up. What it does VERY well it latch on to patterns and then apply those patterns to other things. If it has a couple of examples you can point it to and say "ok build that but over here" the newest revisions of Claude and Gemini are perfectly capable of building the whole thing end to end. Because it's a fairly repetitive task I don't have to spend much time untangling it. I can review it and pattern match against things that don't look right and then dive into those.

For a real example, I needed a page for a user to manually add a vendor in our platform. A simple prompt asking Claude to add a button to the page sent into a mode where it added the button, built the backend handler, added the security checks, defined a form, built another handler to handle the submitted data, and added it to the database. It even wrote the ACL correctly. The errors it introduced were largely around using vanilla HTML in place of our standard components and some small issues with how it attempted to write to the DB using our DB library. This saved me a couple of hours of typing.

Additionally if I need to refactor something AI is a godsend. Just today an underlying query builder completely changed its API and broke..everything. Once I identified how I wanted to handle the changes and wrote some utilities I was able to have Claude just find everything everywhere and make those same changes. It did it with like 90% accuracy. Once again that saved me a couple of hours.

Where it fails, usually spectacularly, is when we get to the stuff that is new or really complex. If it doesn't have patterns to latch onto it tries to invent them itself and the code is garbage. Rarely does it work. Attempting to vibe code it with increasingly more pointed prompts will often result in compiling code but almost never will it do the thing I actually wanted.

In these contexts it's usefulness is mostly things like "write a sql query to do X" which occasionally surfaces a technique I hadn't thought about.

So my experience is pretty mixed. I am definitely saving time. Most of it is typing time not thinking time. Which is like 1/3 of my average day. If I had to guess I am somewhere in the neighborhood of 30-40% faster today than I was in 2019. Notably that speed up has allowed me to really stretch this funding round as we are well past the phase where we would have typically hired people in my past companies. Usually someone relatively mid-level to take over those repetitive tasks.

Instead it's just me and a non-technical founder going along super quickly. We will likely be at a seed round before anyone new comes in.

jsheard

6 days ago

> But all day, every day, a sizable chunk of the front page of HN is allocated to LLMs: incremental model updates, startups doing things with LLMs, LLM tutorials, screeds against LLMs. It’s annoying!

You forgot the screeds against the screeds (like this one)

SKILNER

5 days ago

Let's just imagine we're critiquing cars or planes in about 1910. They're so flawed. No one can say with certainty whether or how they will improve.

Side note of interest, from Wikipedia: "Flying Machines Which Do Not Fly" is an editorial published in the New York Times on October 9, 1903. The article incorrectly predicted it would take one to ten million years for humanity to develop an operating flying machine.

suddenlybananas

5 days ago

Conversely, imagine we're someone talking about colonising the moon in 1969.

pzo

5 days ago

I think we have heated debate because most people don't explain what's their `cost function`. For someone when they talk about AI they take a binary reasoning that if something is not perfect today then it will never be perfect or will never improve. For other is just they see something is useful today and know it will get better next year and don't have expectation of getting AGI.

In your reply it's equivalent of someone expecting AGI in next decade. The same is when people talk about if AI will take software dev jobs. Some just see all the flows in AI and they know they job is secure. Some other see that they are 2x productive and potentially your team mate not needed anymore. If AI can eliminate 50% of IT jobs in ~10-20 years then thats still job replacement. When we replaced horses with cars that doesn't mean we have no horses today or that nobody ride horses.

suddenlybananas

5 days ago

>In your reply it's equivalent of someone expecting AGI in next decade.

This seems an extremely widespread belief among AI boosters.

Dylan16807

5 days ago

If we had funded that as much as we're currently funding AI I think it would have been a plausible goal. Keep in mind we cut more than half of NASA's budget after we first landed.

suddenlybananas

5 days ago

I disagree.

Dylan16807

5 days ago

The gap between spending multiple days somewhere and spending months somewhere isn't that big.

It was only six years to go from the first multi-person spacecraft and first spacewalk to the first space station.

suddenlybananas

5 days ago

>It was only six years to go from the first multi-person spacecraft and first spacewalk to the first space station.

Yeah that's my entire point, technological process doesn't have a constant rate of acceleration. Some advances are quickly made one after another and others lag and take a very long time.

Dylan16807

5 days ago

How long do you think it would have taken to get a permanent moon presence if we kept up Apollo level funding indefinitely with that as the main goal? And since I only said "plausible", let's go with 80th-90th percentile best case scenario.

Even if technological progress stopped we could have launched enough parts to assemble a colony structure.

suddenlybananas

5 days ago

I'm not sure we would have one today. Maybe a token presence but definitely not the moon colony dreams of the 60s.

Dylan16807

5 days ago

A token colony is still a colony. If you meant a specific size I wish you had been more clear upfront.

suddenlybananas

5 days ago

A colony typically refers to a place where people permanently live and are colonising, e.g. setting up state structures, having children, etc.

Dylan16807

5 days ago

Which could be 20 permanent residents and 10 non-permanent slots. I don't think that requires too grand of a structure.

suddenlybananas

5 days ago

This is splitting hairs, but I was envisaging something more like the ISS but on the moon. A far cry from the 50s/60s dream of a grill and white picket fence on the moon.

SKILNER

5 days ago

It's a good idea to think of a converse situation, but this is a bad example. The constraint was not about technology but about budget, perceived benefits and political will.

zahlman

5 days ago

What technologies existing in 1969 might have been used to create an environment with a breathable atmosphere and a self-sustaining ecosystem on the Moon?

What technologies existing today might be used for this purpose, assuming no financial or political limitations?

pier25

5 days ago

I'm mostly skeptical about AI capabilities but I also think it will never be a profitable business. Let's not forget AI companies need to recoup a trillion dollars (so far) just to break even [1].

VCs are already doubting if the billions invested into data centers are going to generate a profit [1 and 2].

AI companies will need to generate profits at some point. Would people still be optimistic about Claude etc if they had to pay say $500 per month to use it given its current capabilities? Probably not.

So far the only company generating real profits out of AI is Nvidia.

[1] https://www.goldmansachs.com/insights/articles/will-the-1-tr...

[2] https://www.nytimes.com/2025/06/02/business/ai-data-centers-...

jillesvangurp

5 days ago

Some of the revenues are very real. A few million subscriptions at tens-hundreds of dollar per month add up to non trivial revenue pretty quickly. I think most software engineers will have such subscriptions fairly soon. We're talking about a market where companies drop 5-20K/month on software engineers and hire whole teams of those. Of course they are going to spend on this. Maybe not 500$. But 20-100$ is a lot less controversial.

And this is quickly spreading beyond software engineering. Software engineers are just being guinea pigs for agentic AIs eventually popping up in all sectors. Basically, while security and quality issues are being sorted out, it helps having users that are a bit more clued in about what they are doing.

That's why AI investments are so hot right now. Of course there are a lot of AI companies that will fall short. There always are. And companies like Nvidia that will make a lot of money selling GPUs.

But there is some non trivial amount of revenue potential there. Anybody still in denial about that is probably penny pinching.

consp

5 days ago

I refuse to pay for any product where I'm the product. Run it locally, maybe I'll pay for it but never online.

My guess would also be that at the 100$ price point only one company can be profitable but that is just a very wild guess.

For what I've seen SE's are some of the last to adopt it. My marketing colleague has been overflowing in generic AI crap produced by external writers for over a year now.

Dylan16807

5 days ago

> I refuse to pay for any product where I'm the product.

Are you "the product" if you're paying $50 but the company also gets $0.35 of value from your data?

If yes I think you're overreacting, if no then I don't think your worries apply to AI subscriptions. (Other worries do, but not that one.)

jillesvangurp

5 days ago

I think this will generally be paid by employers that will pay extra to take care of the usual liability legalese. Which would typically also take care about concerns related to intellectual property, security, etc. This will be about as controversial as office 365, slack and other licenses. And you might reasonably expect some of those subscriptions to include AI features.

Anyway, your employer won't ask you for permission, they'll just expect you to use the tools they provide you with. And the responsible IT manager, VP of engineering, or whomever is just going to look at cost/benefit here and come to some conclusion based on that. My guess is most companies will pay up for some decent subscriptions and will probably get some value out of those.

Likewise freelancers might be asked to not use their personal accounts for security reasons, or they might be expected to get a proper paid subscription that provides the right guarantees in the same way that they might be expected to have a laptop to work on. I've had customers providing me with laptops and I've also done projects with my own laptop. It seems to vary.

pier25

5 days ago

Right now Google, OpenAI, Anthropic, etc are playing a game of chicken. But even after that it's not clear if the remaining players will be able to generate a profit.

Even if AI companies can recoup the billions or trillions of dollars invested, how long will that take and what will be their margin after that?

geraneum

5 days ago

Hence the flurry of articles, controversial interviews and aggressive marketing and the pitch to replace all white color jobs.

concats

5 days ago

What about the free open weights models then? And the open source tooling to go with them?

Sure, they are perhaps 6 months behind the closed-source models, and the hardware to run the biggest and best models isn't really consumer-grade yet (How many years could it be before regular people have GPUs with 200+ gigabytes vram? That's merely one order of magnitude away).

But they're already out there. They will only ever get better. And they will never disappear due to the company going out of business or investors raising prices.

I personally only care about the closed sourced proprietary models in so far as they let me get a glimpse of what I'll soon have access to freely and privately on my own machine. Even if all of them went out of business today, LLMs would still have a permanent effect on our future and how I'd be working.

miningape

5 days ago

How do you plan on running those open weight models then? You need a bunch of very expensive hardware to even slightly "good" performance.

pier25

5 days ago

Will you be buying/renting the hardware needed to run these models with a decent performance?

I can guarantee +99.999% of users won't be doing that.

JoshTriplett

6 days ago

> but the plagiarism

This entire section reads like, oddly, the reverse of the "special pleading" argument that I usually see from artists. Instead of "Oh, it's fine for other fields, but for my field it's a horrible plagiarism machine", it's the reverse: "Oh, it's a problem for those other fields, but for my field get over it, you shouldn't care about copyright anyway".

I'm all for eliminating copyright. The day I can ignore the license on every single piece of proprietary software as I see fit, I'll be all for saying that AIs should be able to do the same. What I will continue to complain about is the asymmetry: individual developers don't get to violate individual licenses, but oh, if we have an AI slurp up millions of codebases and ignore their licenses, that's fine.

No. No, it isn't. If you want to ignore copyright, abolish it for everyone. If it still applies to everyone else, it should still apply to AIs. No special exceptions for mass-scale Open Source license violations.

mwcampbell

6 days ago

I think where tptacek is right, though, is that if we're going to hold this position without hypocrisy, then we need to respect copyright as long as it exists. He's right that many of us have not done that; it's been very common to violate copyright for mere entertainment. If we want the licenses of our own work to be respected, then we need to extend that respect to others as well, regardless of the size of the copyright holder.

Mofpofjis

5 days ago

There are things that "modulate" this. Violating copyright is never right, of course, some questions are however scale, and purpose. Taking others' creative output, unlicensed, for large-scale commercial gain, is about the worst.

Springtime

5 days ago

The whataboutism of that section was odd. The only non-handwavy argument presented is that due to the scale of LLM training that models' output should be treated like US-specific typeface forms' copyright, ie: non-applicable.

It's interesting as typeface plagiarism became rampant beginning in the 70s when more accurate photo reproductions made it trivial. This was problematic for designers wanting to make a livelihood, which is something ITC sought to mitigate by better up-front payments (IIRC from U&lc's coverage) to incentivize quality typeface creation.

There's a distinction though between literal plagiarism and just inspiration from elements. US copyright law doesn't protect either for typeface forms but ironically it does allow copyright for the code used in font files.

I've seen OpenAI's o3-mini (their reasoning model) suggest verbatim code and comments that I found on Github predating LLMs by years. It seems the more times the same code and comment appears online the more likely this is to occur. I'd imagine there would be studies looking into the scope and frequency this occurs and how much is considered fair use.

jsnell

6 days ago

> Almost nothing it spits out for me merges without edits. I’m sure there’s a skill to getting a SOTA model to one-shot a feature-plus-merge!

How does this section fit in with the agent section just after? In an agentic model, isn't the merge getting done by either the model or a tool, and the retry-loops on failures would be mostly invisible?

E.g. when using Aider + Gemini Flash 2.5, probably 90% of the changes apply cleanly from my perspective (maybe half actually apply cleanly, the other half after a couple of roundtrips of Aider telling the model that the patch didn't apply). The 10% that only apply partially I usually throw away and redo the prompt, it's really rare that I start merging the code manually.

api

6 days ago

A big problem is that you're either hearing breathless over the top insane hype (or doomerism, which is breathless over the top hype taken to a dark place) or skepticism that considers AI/LLMs to be in the same league as NFTs.

Neither of these is accurate, but I guess nuanced thinking or considering anything below surface vibes is out these days.

So far after playing with them I'm using them as:

1. A junior intern that can google really really fast and has memorized a large chunk of the Internet and the library, and can do rough first-pass research and dig for things.

2. Autocomplete 2.0 that can now generate things like boilerplate or fairly pedestrian unit tests.

3. Rubber duck debugging where the rubber duck talks back.

4. A helper to explain code, at least for a first pass. I can highlight a huge piece of code and ask it to summarize and then explain and walk me through it and it does a passable job. It doesn't get everything right but as long as you know that, it's a good way to break things down and get into it.

For those things it's pretty good, and it's definitely a lot of fun to play with.

I expect that it will get better. I don't expect it to replace programmers for anything but the most boring mindless tasks (the ones I hate doing), but I expect it to continue to become more and more useful as super-autocomplete and all the other things I listed.

dvt

5 days ago

Fully agree with @tptacek here:

> But AI is also incredibly — a word I use advisedly — important. It’s getting the same kind of attention that smart phones got in 2008, and not as much as the Internet got. That seems about right.

However, I just don't think the AI coding part is that interesting or future-thinking. We're seeing so much more progress in semantic search, tool calling, general purpose uses, robotics, I mean, DeepMind just won a Nobel for goodness' sake.

Don't get me wrong, I use ChatGPT to write all kinds of annoying boilerplate, and it's not too bad at recalling weird quirks I don't remember (yes, even for Rust). But hard problems? Real problems? Zero shot. Novel problems? No way.

> But I’ve been first responder on an incident and fed 4o — not o4-mini, 4o — log transcripts, and watched it in seconds spot LVM metadata corruption issues on a host we’ve been complaining about for months.

I'm going to go ahead and press (X) to doubt on this anecdote. You've had an issue for months and the logs were somehow so arcane, so dense, so unparseable, no one spotted these "metadata corruption issues?" I'm not going to accuse anyone of blatant fabrication, but this is very hard to swallow.

Listen, I also think we're on the precipice of re-inventing how we talk to our machines; how we automate tasks; how we find and distribute small nuggets of data. But, imo, coding just ain't it. Donald Knuth calls computer programming an art, and to rob humanity of effecting not just coding—but any art, I'd argue—would be the most cardinal of sins.

Aurornis

5 days ago

> I'm going to go ahead and press (X) to doubt on this anecdote. You've had an issue for months and the logs were somehow so arcane, so dense, so unparseable, no one spotted these "metadata corruption issues?" I'm not going to accuse anyone of blatant fabrication, but this is very hard to swallow.

I have this problem with a lot of LLM miracle anecdotes. There’s an implication that the LLM did something that was eluding people for months, but when you read more closely they don’t actually say they were working on the problem for months. Just that they were complaining about it for months.

tpmoney

5 days ago

>There’s an implication that the LLM did something that was eluding people for months, but when you read more closely they don’t actually say they were working on the problem for months. Just that they were complaining about it for months.

On the other hand, we've all probably had the experience of putting out a fire and wanting time to track down an issue only to be told to not bother since "everything is working now". Sometimes you spend months complaining about something because the people you're complaining to don't have the time to dive into an issue. Even if it would have taken mere hours for a human to hunt down the issue, someone still has to be given those hours to work on it. By contrast, copying a bunch of logs into an AI is nearly free.

tpmoney

5 days ago

> You've had an issue for months and the logs were somehow so arcane, so dense, so unparseable, no one spotted these "metadata corruption issues?" I'm not going to accuse anyone of blatant fabrication, but this is very hard to swallow.

Eh, I've worked on projects where because of un-revisited logging decisions made in the past, 1-10k error logs PER MINUTE were normal things to see. Finding the root cause of an issue often boiled down to multiple attempts at cleaning up logs to remove noise, cleaning up a tangentially related issues and waiting for it to happen again. More than one root cause was discovered by sheer happenstance of looking at the right subset of the logs at the right moment in time. I can absolutely buy that a system built for parsing large amounts of text and teasing patterns out of that found in minutes what humans could not track down over months.

bergen

5 days ago

One point is missing for me - you get lazy. People are less and less equiped to think about complex problems the more they RELY on this for coding. Also this article is just about coding.

rzodkiew

5 days ago

Not sure why more people aren't mentioning this. But that's the exact reason I've stopped using any LLM stuff. Your thinking just gets lazier and lazier the more you use it. You stop thinking the problems deeply and thoroughly when the "answer" is just around the corner. Then one day you snap out of that mirage, and see what a boatload of junk you've just created with this stuff.

I also dread what's going to happen to the younger generations which have this to replace their real education, but that's a bit offtopic here.

apercu

5 days ago

> My AI skeptic friends are all nuts

My CEO friends that overstate the efficacy of LLMs are all nuts.

(I say this as someone who has been working in "AI" as part of Digital transformation roadmaps for SEVEN years and as someone who has used LLM's daily in my workflow for 3 years).

simonw

5 days ago

I think both of those statements are true at the same time.

leoh

6 days ago

>but it’s bad at rust

I have to say, my ability to learn Rust was massively accelerated via LLMs. I highly recommend them for learning a new skill. I feel I'm roughly at the point (largely sans LLMs) now where I can be nearly as productive in Rust as Python. +1 to RustRover as well, which I strongly prefer to any other IDE.

lycopodiopsida

5 days ago

The interesting question is: is it really that bad at Rust, or does Rust's strict compiler just catch more errors which remain hidden in, let us say, Go? The usual hand-waving response is that developers should write more tests instead - as if a boring and tedious task such as writing tests will not be passed to LLM.

reaperducer

6 days ago

I have to say, my ability to learn Rust was massively accelerated via LLMs.

How would you know?

If you didn't know Rust already, how would you know the LLM was teaching you the right things and the best way to do things?

Just because it compiles doesn't mean it works. The world is full of bad, buggy, insecure, poor code that compiles.

empath75

6 days ago

I've been writing Rust code in production for 4+ years, and I can write Rust pretty well, and I've learned a lot from using chatgpt and co-pilot/cursor.

In particular, it helped me write my first generic functions and macros, two things that were pretty intimidating to try and get into.

leoh

5 days ago

100%. Using an LLM to thoughtfully craft my first useful macro felt like a super-power.

tpmoney

5 days ago

How does anyone self learning know they're learning the "right things and the best way to do things"? By putting the stuff they've learned into practice and putting it up against the real world. How many Rust tutorials are out there that teach things incorrectly, non-idomatically or just inefficiently? How does anyone not already an expert know except by trying it out?

metaltyphoon

6 days ago

No only this, but I would challenge the OP to see if he really knows Rust but turning off LLM and see “how much you truly know”.

This is akin to be on tutorial hell and you “know the language “

leoh

6 days ago

Well, I coded at Google (in addition to other places) for over 10 years without LLMs in several languages and I feel like I’m about at par with Rust as I was with those languages. I’m open to being humbled, which I have felt by LLMs and ofc other folks — “good” is subjective.

empath75

6 days ago

It is not bad at rust. I don't think I could even function well as a Rust programmer without chatgpt and now Cursor. It removes a lot of the burden of remembering how to write generic code and fixing borrow checking stuff. I can just write a generic function with tons of syntax errors and then tell cursor to fix it.

sleepy_keita

6 days ago

Me too -- actually, I'd say that the LLMs I use these days (Sonnet 4 and GPT4.1, o4, etc) are pretty good at rust.

ezst

6 days ago

How much of that proficiency remains once you switch it off?

leoh

6 days ago

Quite a lot, but hey, feel free to put me to the test

kevinsync

6 days ago

Not to derail, but NFT mania (part of the opening salvo in the article) was the giant shitshow that it was -not- because the concept of unique digital bits in the possession of a single owner was a bad idea (or, the concept of unique verification of membership in a club was a bad idea) -- it was a diarrhea-slicked nightmare because it was implemented via blockchains and their related tokens, which inherently peg fluctuating fiat value to the underlying mechanisms of assigning and verifying said ownership or membership, and encourages a reseller's market .. not to mention the perverse, built-in economic incentives required to get nodes to participate in that network to make the whole thing go.

Had NFTs simply been deployed as some kind of protocol that could be leveraged for utility rather than speculation, I think the story would be a complete 180. No clue personally how to achieve that, but it feels like it could be done.. except that, too, would have been completely perverted and abused by centralized behemoths, leading to a different but terrible outcome. Can you imagine if all data became non-fungible? Convince all the big identity vendors (Google, Apple, etc) to issue key pairs to users that then get used by media companies to deliver audio and video keyed only to you that's embedded with maybe some kind of temporal steganographic signature that's hard to strip and can be traced back to your key? It's not just cracking AACS once and copying the bytes. It becomes this giant mess of you literally can't access anything without going through centralized authorities anymore. Then build more anti-patterns on top of that lol. Prolly better that it was mostly just monkey JPEGs and rug pulls.

Anyways, I'm so far off topic from what's actually being discussed -- just couldn't help myself from veering into left field.

thombles

5 days ago

Hmm a bit of a tone shift from: https://fly.io/docs/about/open-source/

> Fly.io builds on the work of a tremendous open source infrastructure community. We want open source authors to benefit from their work because we think a healthy, thriving open source ecosystem will help us build better products.

To the fine article:

> Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass.

sho_hn

5 days ago

Imho, the fly.io people are largely performative HN attention seekers that practice writing posts to produce engagement. It's almost charming in the way it resembles the early-on blogosphere, and you should read them as token summaries of "this is a popular sentiment right now, framed to appear a bit wise and edgy". They're often wordy versions of "that one trick they don't want you to know" or "am I really the only one that [actually quite popular thing]".

In the sense that the posts distill whatever HN at the time would accept as thought leadership it's a good pulse check on the community.

inahga

5 days ago

It's rather shocking for a company to admit to flaunting software licensing in public communications.

But this is not new https://fly.io/blog/wrong-about-gpu/

> At one point, we hex-edited the [NVIDIA] closed-source drivers to trick them into thinking our hypervisor was QEMU.

mattwad

6 days ago

There's a huge caveat i don't see often, which is that it depends on your language for programming. IE. AI is reallllly good at writing Next.js/Typescript apps, but not so much Ruby on Rails. YMMV

el_memorioso

6 days ago

I agree with this. People who are writing Python, Javascript, or Typescript tell me that they get great results. I've had good results using LLMs to flesh out complex SQL queries, but when I write Elixir code, what I get out of the LLM often doesn't even compile even when given function and type specs in the prompt. As the writer says, maybe I should be using an agent, but I'd rather understand the limits of the lower-level tools before adding other layers that I may not have access to.

dgb23

6 days ago

My hunch is that to exploit LLMs one should lean on data driven code more. LLMs seem to have a very easy time to generate data literals. Then it's far less of an issue to write in a niche language.

Not familiar with Elixir but I assume it's really good at expressing data driven code, since it's functional and has pattern matching.

skydhash

5 days ago

I think for some languages like Clojure and Elixir, it's just so easy to get to the level of abstraction you need to write your business logic that everyone does so. So the code does not have any commonalty with each other. Even when using the same framework/library.

But for Python, JS, etc,... it's the same down to earth abstraction that everyone is dealing with, like the same open a file, parse a csv, connect to the database patterns.

bob1029

6 days ago

I am finding the most destructive aspect of LLM assistance to be the loss of flow state.

Most of the time I can go faster than these tools if I have confidence in myself and allow the momentum to build up over the course of 20-30 minutes. Every time I tab out to an LLM is like a 5 minute penalty over what I could have done unaided on a good day.

Getting the model prepared to help you in a realistic domain often takes a few minutes of arranging code & comments so that it is forced toward something remotely sane. I'll scaffold out entire BS type hierarchies just so I can throw a //TODO: .... line in the middle somewhere. Without this kind of structure, I would be handling unfiltered garbage most of the time.

It's not that these tools are bad, it's that we need to recognize the true cost of engaging with them. ChatGPT is like a jackhammer. It will absolutely get you through that concrete slab. However, it tends to be quite obnoxious & distracting in terms of its operational principles.

Mofpofjis

5 days ago

> I am finding the most destructive aspect of LLM assistance to be the loss of flow state.

Finally someone else mentions Flow!

samcat116

5 days ago

I find the opposite, I can stay in flow state much longer than I could before because tabbing out to an LLM is a tighter loop than me going out to google and searching through sites for smaller tidbits that I have to assemble and get through, rather than a full solution that might not be 100% correct, but most of the time I can get to the working solution quicker.

paradox242

5 days ago

I like Thomas, but I find his arguments include the same fundamental mistake I see made elsewhere. He acknowledged that the tools need an expert to use properly, and as he illustrated, he refined his expertise over many years. He is of the first and last generation of experienced programmers who learned without LLM assistance. How is someone just coming out of school going to get the encouragement and space to independently develop the experience they need to break out of the "vibe coding" phase? I can almost anticipate an interjection along the lines of "well we used to build everything with our hands and now we have tools etc, it's just different" but this is an order of magnitude different. This is asking a robot to design and assemble a shed for you, and you never even see the saw, nails, and hammer being used, let alone understand enough about how the different materials interact to get much more than a "vibe" for how much weight the roof might support.

hatthew

5 days ago

I think the main difference between shortcuts like "compilers" and shortcuts like "LLMs" is determinism. I don't need to know assembly because I use a compiler that is very well specified, often mathematically proven to introduce no errors, and errs on the side of caution unless specifically told otherwise.

On the other hand, LLMs are highly nondeterministic. They often produce correct output for simple things, but that's because those things are simple enough that we trust the probability of it being incorrect is implausibly low. But there's no guarantee that they won't get them wrong. For more complicated things, LLMs are terrible and need very well specified guardrails. They will bounce around inside those guardrails until they make something correct, but that's more of a happy accident than a mathematical guarantee.

LLMs aren't a level of abstraction, they are an independent entity. They're the equivalent of a junior coder who has no long term memory and thus needs to write everything down and you just have to hope that they don't forget to write something down and hope that some deterministic automated test will catch them if they do forget.

If you could hire an unpaid intern with long term memory loss, would you?

chrismorgan

5 days ago

Determinism is only one part of it: predictability and the ability to model what it’s doing is perhaps more important.

The physics engine in the game Trackmania is deterministic: this means that you can replay the same inputs and get the same output; but it doesn’t mean the output always makes sense: if you drive into a wall in a particular way, you can trigger what’s called an uberbug, where your car gets flung in a somewhat random direction at implausibly high speed. (This sort of thing can lead to fun tool-assisted speedruns that are utterly unviable for humans.)

The abstractions part you mention, there’s the key. Good abstractions make predictable. Turn the steering wheel to the left, head left. There are still odd occasions when I will mispredict what some code in a language like Rust, Python or JavaScript will do, but they’re rare. By contrast, LLMs are very unpredictable, and you will fundamentally never be able to mentally model what it achieves.

hatthew

5 days ago

Having an LLM code for you is like watching someone make a TAS. It technically meets the explicitly-specified goals of the mapper (checkpoints and finish), but the final run usually ignores the intended route made by the mapper. Even if the mapper keeps on putting in extra checkpoints and guardrails in between, the TAS can still find a crazy uberbug into backflip into taking half the checkpoints in reverse order. And unless you spend far longer studying the TAS than it would have taken to learn to drive it yourself normally, you're not going to learn much yourself.

adamddev1

5 days ago

Exactly. Compilers etc. are like well-proven algebraic properties, you can build on them and reason with them and do higher level math with confidence. That's a very different type of "advancement" than what we're seeing with LLMs.

red75prime

5 days ago

> If you could hire an unpaid intern with long term memory loss, would you?

It's clearly a deficiency. And that's why one of the next generations of AIs will have long term memory and online learning. Although even the current generation of the models shows signs of self-correction that somewhat mitigate the "random walk" you've mentioned.

hatthew

5 days ago

ok, let me know when that happens

red75prime

5 days ago

Don't worry. It would be hard to miss.

But, seriously, while it's not an easy task (otherwise we'd have seen it done already), it doesn't seem to be a kind of task that requires a paradigm shift or some fundamental discovery. It's a search in the space of network architectures.

Of course, we are talking about something that hasn't been invented yet, so I can't provide ironclad arguments like with, say, fusion power where we know what to do and it's just hard to do.

There is circumstantial evidence though. Complex problem solving skills that evolved in different groups of species: homo, corvidae, octopoda. Which points at either existence of multiple solutions to the problem of intelligence or at not that high complexity of a solution.

Anyway, with all the resources that are put into the development of AI will see the results (one way or another) soon enough. If long-term memory is not incorporated into AIs in about 5 years, then I'm wrong and it's indeed likely to be a fundamental problem with the current approach.

haskellandchill

5 days ago

It's not just one unpaid intern with long term memory loss, it's several of them. And they don't need breaks.

If you could hire an army of unpaid interns with long term memory loss who work 24/7, would you?

hatthew

5 days ago

Hell no, any experienced engineer would rather do it themselves than attempt to corral an untrained army. Infinite monkeys can write a sequel to shakespeare, but it's faster to write it myself than to sift through mountains of gibberish on a barely-domesticated goose chase.

d_watt

5 days ago

What do you think the "mistake" is here?

It seems like you're pointing out a consequence, not a counter argument.

travisjungroth

5 days ago

There’s a really common cognitive fallacy of “the consequences of that are something I don’t like, therefore it’s wrong”.

It’s like reductio ad absurdum, but without the logical consequence of the argument being incorrect, just bad.

You see it all the time, especially when it comes to predictions. The whole point of this article is coding agents are powerful and the arguments against this are generally weak and ill-informed. Coding agents having a negative impact on skill growth of new developers isn’t a “fundamental mistake” at all.

efitz

5 days ago

Exactly.

What I’ve been saying to my friends for the last couple of months has been, that we’re not going to see coding jobs go away, but we’re going to run into a situation where it’s harder to grow junior engineers into senior engineers because the LLMs will be doing all the work of figuring out why it isn’t working.

This will IMO lead to a “COBOL problem” where there are a shortage of people with truly deep understanding of how it all fits together and who can figure out the line of code to tweak to fix that ops problem that’s causing your production outage.

I’m not arguing for or against LLMs, just trying to look down the road to consequences. Agentic coding is going to become a daily part of every developer’s workflow; by next year it will be table stakes - as the article said, if you’re not already doing it, you’re standing still: if you’re a 10x developer now, you’ll be a 0.8x developer next year, and if you’re a 1x developer now, without agentic coding you’ll be a 0.1x developer.

It’s not hype; it’s just recognition of the dramatic increase in productivity that is happening right now.

jjfoooo4

5 days ago

> How is someone just coming out of school going to get the encouragement and space to independently develop the experience they need to break out of the "vibe coding" phase?

LLM's are so-so coders but incredible teachers. Today's students get the benefit of asking copying and pasting a piece of code into an LLM and asking, "How does this work?"

There's a lot of young people that will use LLM's to be lazy. There's also a lot that will use them to feed their intellectual curiosity.

kweingar

5 days ago

Many of the curious ones will be adversely affected.

When you're a college student, the stakes feel so high. You have to pass this class or else you'll have to delay graduation and spend thousands of dollars. You have to get this grade or else you lose your grant or scholarship. You want to absorb knowledge from this project (honestly! you really do) but you really need to spend that time studying for a different class's exam.

"I'm not lazy, I'm just overwhelmed!" says the student, and they're not wrong. But it's very easy for "I'm gonna slog through this project" to become "I'm gonna give it a try, then use AI to check my answer" and then "I'm gonna automate the tedious bits that aren't that valuable anyway" and then "Well I'll ask ChatGPT and then read its answer thoroughly and make sure I understand it" and then "I'll copy/paste the output but I get the general idea of what it's doing."

johnnyanmac

5 days ago

Is that what students will do, though? Or will they see the cynical pump and dump and take the shortcuts to get the piece of paper and pass the humiliation ritual of the interview process?

jacob019

5 days ago

I'm hearing this fear more frequently, but I do not understand it. Curriculum will adapt. We are a curious and intelligent species. There will be more laypeople building things that used to require deep expertise. A lot of those things will be garbage. Specialists will remain valuable and in demand. The kids will still learn to write loops, use variables, about OOP and functional programming, how to write "hello world," to add styles, to accept input, etc. And they'll probably ask a model for help when they get stuck, and the teacher won't let them use that during a test. The models will be used in many ways, and for many things, but not all things; it will be normal and fine. Developing will be more productive and more fun, with less toil.

protocolture

5 days ago

>How is someone just coming out of school going to get the encouragement and space to independently develop the experience they need to break out of the "vibe coding" phase?

Dunno. Money is probably going to be a huge incentive.

I see the same argument everywhere. Like animators getting their start tweening other peoples content. AI is great at tweening and likely to replace farms of juniors. But companies will need seniors to direct animation, so they will either have to pay a lot of money to find them or pay a lot of money to train them.

hippari2

5 days ago

Well this is actually happening in Japanese Animation and the result is that no young talents are getting trained in the workforce. [1]

But unlike animation, where the demand for the art can just disappear. I don't think the demand for software engineer will disappear. Same thing with musician. Young engineers might just be jobless or on training mode for much longer period of time before they can make actual living money.

Good thing is, as far as I know, Kyoto Animation managed to avoid this issue by having in-house training, growing their own talent pools.

[1]: https://blog.sakugabooru.com/2023/03/31/the-long-quest-to-fi...

p1necone

5 days ago

Expecting commercial entities to engage in long term thinking when they can not do that and reduce costs in the next financial quarter is a fools game.

I think what you've said is largely true, but not without a long period of mess in between.

protocolture

5 days ago

Back in the day I found significant career advancement because something that I haven't been able to identify (lack of on the job training i believe) had removed all the mid level IT talent in my local market. For a while I was able to ask for whatever I wanted because there just was not anyone else available. I had a week where a recruitment agency had an extremely attractive young woman escort me around a tech conference, buying me drinks and dinner and then refer me out to a bespoke MSP for a job interview (which I turned down which is funny) The market did eventually respond but it benefited me greatly. I imagine, this is what it will be like for a decade or so as a trained senior animator. No competition coming up, and plenty of money to be made. Until businesses sort their shit out, which like you say will happen eventually.

thewebguyd

5 days ago

> get the encouragement and space to independently develop the experience they need to break out of the "vibe coding" phase?

I wonder this too, as someone who is entirely self-taught, when I started escaping “tutorial hell” was the hardest part of the journey, and took quite a bit of both encouragement and sheer willpower. Not sure I would have ever went beyond that if I had LLMs.

I worry for Juniors, and either we’ll need to find a way to mentor them past the vibe coding phase, or we hope that AI gets good enough before we all retire.

2muchcoffeeman

5 days ago

There will always be people that manage to get into the guts of something.

All AI is going to do is create a new class of programmer, such that the people who know the details will end up being more valuable.

mdnahas

5 days ago

I wonder if that will make the great generation of human coders. Some of our best writers were the generation that spanned between oral education and mass production of books. Late generations read and wrote, rather than memorized and spoke. I think that was Shakespeare’s genius. Maybe our best coders will be supercharged with AI, and subsequent ones enfeabled by it.

Shakespeare was also popular because he was published as books became popular. Others copied him.

SpicyLemonZest

5 days ago

I suppose the counterargument is, how many experienced programmers today have seen a register or a JMP instruction being used?

MobiusHorizons

5 days ago

Quite a lot of the good programmers I have worked with may never have needed to write assembly, but are also not at all confused or daunted by it. They are curious about their abstractions, and have a strong grasp of what is going on beneath the curtain even if they don't have to lift it all that often.

Most of the people I work with, however, just understand the framework they are writing and display very little understanding or even curiosity as to what is going on beneath the first layer of abstraction. Typically this leaves them high and dry when debugging errors.

Anecdotally I see a lot more people with a shallow expertise believing the AI hype.

eslaught

5 days ago

The difference is that the abstraction provided by compilers is much more robust. Not perfect: sometimes programmers legitimately need to drop into assembly to do various things. But those instances have been rare for decades and to a first approximation do not exist for the vast majority of enterprise code.

If AI gets to that level we will indeed have a sea change. But I think the current models, at least as far as I've seen, leave open to question whether they'll ever get there or not.

haiku2077

5 days ago

It's pretty common for CS programs to include at least one course with assembly programming. I did a whole class programming controllers in MIPS.

metaltyphoon

5 days ago

I would assume at least the ones that did a formal CS degree would know JMP exists.

weikju

5 days ago

Your compiler does not hallucinate registers or JMP instructions

zdragnar

5 days ago

Doesn't it? Many compilers offer all sorts of novel optimizations for operations that end up producing the same result with entirely different runtime characteristics than the source code would imply. Going further, turn on gcc fast math and your code with no undefined behavior suddenly has undefined behavior.

I'm not much of a user of LLMs for generating code myself, but this particular analogy isn't a great fit. The one redeeming quality is that compiler output is deterministic or at least repeatable, whereas LLMs have some randomness thrown in intentionally.

With that said, both can give you unexpected behavior, just in different ways.

skydhash

5 days ago

> With that said, both can give you unexpected behavior, just in different ways.

Unexpected as in "I didn't know" is different than unexpected as in "I can't predict". GCC optimizations is in the former camp and if you care to know, you just need to do a deep dive in your CPU architecture and the gcc docs and codebase. LLMs is a true shot in the dark with a high chance miss and a slightly lower chance of friendly fire.

Dylan16807

5 days ago

And every single developer is supposed to memorize GCC's optimizations so they never make a change that will optimize wrong?

Nah, treating undefined behavior as predictable is a fool's errand. It's also a shot in the dark.

skydhash

5 days ago

What is that about memorization? I just need to know where the information is so I can refer to it later when I need it (and possibly archive them if they're that important).

Dylan16807

5 days ago

If you're not trying to memorize the entirety of GCC's behavior (and keeping up with its updates), then you need to check if your UB is still doing what you expect every single time you change your function. Or other functions near it. Or anything that gets linked to it.

It's effectively impossible to rely on. Checking at the time of coding, or occasionally spot checking, still leaves you at massive risk of bugs or security flaws. It falls under "I can't predict".

skydhash

5 days ago

In C, strings are just basic arrays which themselves are just pointers. There’s no safeguards there like we have in Java, so we need to write the guardrails ourselves, because failure to do so result in errors. If you didn’t know about it, a buffer overflow may be unexpected, but you don’t need to go and memorize the entire gcc codebase to know. Just knowing the semantics is enough.

The same thing happens with optimization. They usually warns about the gotchas, and it’s easy to check if the errors will bother you or not. You dont have to do an exhaustive list of errors when the classes are neatly defined.

Dylan16807

5 days ago

This comment seems to mostly be describing avoiding undefined behavior. You can learn the rules to do that, though it's very hard to completely avoid UB mistakes.

But I'm talking about code that has undefined behavior. If there is any at all, you can't reliably learn what optimizations will happen or not, what will break your code or not. And you can't check for incorrect optimization in any meaningful way, because the result can change at any point in the future for the tiniest of reasons. You can try to avoid this situation, but again it's very hard to write code that has exactly zero undefined behavior.

When you talked about doing "a deep dive in your CPU architecture and the gcc docs and codebase", that is only necessary if you do have undefined behavior and you're trying to figure out what actually happens. But it's a waste of effort here. If you don't have UB, you don't need to do that. If you do have UB it's not enough, not nearly enough. It's useful for debugging but it won't predict whether your code is safe into the future.

To put it another way, if we're looking at optimizations listing gotchas, when there's UB it's like half the optimizations in the entire compiler are annotated with "this could break badly if there's UB". You can't predict it.

MobiusHorizons

5 days ago

I suppose you are talking about UB? I don't think that is anything like Halucination. It's just tradeoffs being made (speed vs specified instructions) with more ambiguity (UB) than one might want. fast math is basically the same idea. You should probably never turn on fast math unless you are willing to trade speed for accuracy and accept a bunch of new UB that your libraries may never have been designed for. It's not like the compiler is making up new instructions that the hardware doesn't support or claiming the behavior of an instruction is different than documented. If it ever did anything like that, it would be a bug, and would be fixed.

Dylan16807

5 days ago

> or claiming the behavior of an instruction is different than documented

When talking about undefined behavior, the only documentation is a shrug emoticon. If you want a working program, arbitrary undocumented behavior is just as bad as incorrectly documented behavior.

MobiusHorizons

5 days ago

UB is not undocumented. It is documented to not be defined. In fact any given hardware reacts deterministically in the majority of UB cases, but compilers are allowed to assume UB was not possible for the purposes of optimization.

Dylan16807

5 days ago

The trigger for UB is documented, the result is not documented.

And despite the triggers being documented, they're very very hard to avoid completely.

osti

5 days ago

I bet they did at one point in time, then they stopped doing that, but still not bug free.

weikju

a day ago

lol are you serious? I bet compilers are less deterministic now than before what with all the CPUs and their speculative executions and who knows what else. But all that stuff is still documented and not made out of thin air randomly…

JKCalhoun

5 days ago

Agree. We'll get a new breed of programmer — not shitty ones — just different. And I am quite sure, at some point in their career, they'll drop down to some lower level and try to do things manually.... Or step through the code and figure out a clever way to tighten it up....

Or if I'm wrong about the last bit, maybe it never was important.

bodge5000

5 days ago

Counter-counterargument; You don't need to understand metalworking to use a hammer or nails, that's a different trade, though an important trade that someone else does need to understand in order for you to do your job.

If all of mankind lost all understanding of registers overnight, it'd still affect modern programming (eventually)

kfajdsl

5 days ago

Anyone that's gotten a CS degree or looked at godbolt output.

prmph

5 days ago

Not really a counter-argument.

The abstraction over assembly language is solid; compilers very rarely (if at all) fail to translate high level code into the correct assembly code.

LLMs are nowhere near the level where you can have almost 100% assurance that they do what you want and expect, even with a lot of hand-holding. They are not even a leaky abstraction; they are an "abstraction" with gaping holes.

the__alchemist

5 days ago

Registers: All the time for embedded. JMP instruction? No idea what that is!

harpiaharpyja

5 days ago

Probably more than you might think.

As a teen I used to play around with Core Wars, and my high school taught 8086 assembly. I think I got a decent grasp of it, enough to implement quicksort in 8086 while sitting through a very boring class, and test it in the simulator later.

I mean, probably few people ever need to use it for something serious, but that doesn't mean they don't understand it.

Aeolun

5 days ago

Feels like coding with and without autocomplete to me. At some point you are still going to need to understand what you are doing, even if your IDE gives you hints about what all the functions do.

Sure, it's a different level, but it's still more or less the same thing. I don't think you can expect to learn how to code by only ever using LLMs, just like you can't learn how to code by only ever using intellisense.

Aurornis

5 days ago

> I like Thomas, but I find his arguments include the same fundamental mistake I see made elsewhere

Some of the arguments in the article are so bizarre that I can’t believe they’re anything other than engagement bait.

Claiming that IP rights shouldn’t matter because some developers pirate TV shows? Blaming LLM hallucinations on the programming language?

I agree with the general sentiment of the article, but it feels like the author decided to go full ragebait/engagement bait mode with the article instead of trying to have a real discussion. It’s weird to see this language on a company blog.

I think he knows that he’s ignoring the more complex and nuanced debates about LLMs because that’s not what the article is about. It’s written in inflammatory style that sets up straw man talking points and then sort of knocks them down while giving weird excuses for why certain arguments should be ignored.

tptacek

5 days ago

They are not engagement bait. That argument, in particular, survived multiple rounds of reviews with friends outside my team who do not fully agree with me about this stuff. It's a deeply sincere, and, I would say for myself, earned take on this.

A lot of people are misunderstanding the goal of the post, which is not necessarily to persuade them, but rather to disrupt a static, unproductive equilibrium of uninformed arguments about how this stuff works. The commentary I've read today has to my mind vindicated that premise.

lolinder

5 days ago

> That argument, in particular, survived multiple rounds of reviews with friends outside my team who do not fully agree with me about this stuff. It's a deeply sincere, and, I would say for myself, earned take on this.

Which argument? The one dismissing all arguments about IP on the grounds that some software engineers are pirates?

That argument is not only unpersuasive, it does a disservice to the rest of the post and weakens its contribution by making you as the author come off as willfully inflammatory and intentionally blind to nuance, which does the opposite of breaking the unproductive equilibrium. It feeds the sense that those in the skeptics camp have that AI adopters are intellectually unserious.

I know that you know that the law and ethics of IP are complicated, that the "profession" is diverse and can't be lumped into a cohesive unit for summary dismissal, and that there are entirely coherent ethical stances that would call for both piracy in some circumstances and condemnation of IP theft in others. I've seen enough of your work to know that dismissing all that nuance with a flippant call to "shove this concern up your ass" is beneath you.

simoncion

5 days ago

> The one dismissing all arguments about IP on the grounds that some software engineers are pirates?

Yeah... this was a really, incredibly horseshit argument. I'm all for a good rant, but goddamn, man, this one wasn't good. I would say "I hope the reputational damage was worth whatever he got out of it", but I figure he's been able to retire at any time for a while now, so that sort of stuff just doesn't matter anymore to him.

tptacek

5 days ago

I love how many people have in response to this article tried to intimate that writing it put my career in jeopardy; so forcefully do they disagree with a technical piece that it must somehow be career-limiting.

simoncion

5 days ago

It's just such a mind-meltingly bad argument, man.

"A whole bunch of folks ignore copyright terms, so all complaints that 'Inhaling most-to-all of the code that can be read on the Internet with the intent to make a proprietary machine that makes a ton of revenue for the owner of that machine and noone else is probably bad, and if not a violation of the letter of the law, surely a violation of its spirit.' are invalid."

When I hear someone sincerely say stuff that works out to "Software licenses don't matter, actually.", I strongly reduce my estimation of their ability to reason well and behave ethically. Does this matter? Probably not. There are many folks in the field who hold that sort of opinion, so it's relatively easy to surround yourself with likeminded folks. Do you hold these sorts of opinions? Fuck if I know. All I know about is what you wrote today.

Anyway. As I mentioned, you're late-career in what seems to be a significantly successful career, so your reputation absolutely doesn't matter, and all this chatter is irrelevant to you.

tptacek

5 days ago

I don't know who you're quoting, but it's not me.

simoncion

5 days ago

I'm not quoting anyone. Perhaps wrapping the second paragraph in what I understand to be Russian-style quotes (« ») would have been clearer? Or maybe prepending the words "Your argument ends up being something like " to the second paragraph would have been far clearer? shrug

tptacek

5 days ago

On HN, the convention is that quotations indicate literal quotation. It's not a reasonable paraphrase of my argument either, but you know that.

simoncion

5 days ago

> It's not a reasonable paraphrase of my argument either, but you know that.

To quote from your essay:

"But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.

The median dev thinks Star Wars and Daft Punk are a public commons. The great cultural project of developers has been opposing any protection that might inconvenience a monetizable media-sharing site. When they fail at policy, they route around it with coercion. They stand up global-scale piracy networks and sneer at anybody who so much as tries to preserve a new-release window for a TV show."

Man, you might not see the resemblance now, but if you return to it in three to six months, I bet you will.

Also, I was a professional musician in a former life. Given the content of your essay, you might be surprised to know how very, very fast and loose musicians as a class play with copyright laws. In my personal experience, the typical musician paid for approximately zero of the audio recordings in their possession. I'd be surprised if things weren't similar for the typical practitioner of the visual arts.

meowface

5 days ago

I agree this is a bad "collective punishment" argument from him, even if I think he's somewhat right in spirit because I as a software dev don't care in the slightest about LLMs training on code, text, videos, or images and fully believe it's equivalent to humans perceiving and learning from the output of others and I know many or most software devs agree on that point while most artists don't.

I think artists are very cavalier about IP, on average. Many draw characters from franchises that do not allow such drawing, and often directly profit by selling those images. Do I think that's bad? No. (Unless it's copying the original drawing plagiaristically.) Is it odd that most of the people who profit in this way consider generative AI unethical copyright infringement? I think so.

I think hypocrisy on the issue is annoying. Either you think it's cool for LLMs to learn from code and text and images and videos or you don't think any of it is fine. tptacek should bite one bullet or the other.

tptacek

5 days ago

I don't accept the premise that "training on" and "copying" are the same thing, any more than me reading a book and learning stuff is copying from the book. But past that, I have, for the reasons stated in the piece, absolutely no patience for software developers trying to put this concern on the table. From my perspective, they've forfeited it.

lolinder

4 days ago

Who is this "they" who have forfeited it?

What about the millions of software developers who have never even visited a pirate site, much less built one?

Are we including the Netflix developers working actively on DRM?

How about the software developers working on anti-circumvention code for Kindle?

I'm totally perplexed at how willing you are to lump a profession of more than 20 million people all into one bucket and deny all of them, collectively, the right to say anything about IP. Are doctors not allowed to talk about the society harms of elective plastic surgery because some of them are plastic surgeons? Is anyone with an MBA not allowed to warn people against scummy business practices because many-to-most of them are involved in dreaming those practices up?

This logic makes no sense, and I have to imagine that you see that given that you're avoiding replying to me.

tptacek

4 days ago

You can say whatever you'd like about IP. You just don't get to tell me how to hear it.

lolinder

4 days ago

I mean, that's fine I guess? As long as you're aware that you're being totally and utterly irrational about it.

simoncion

a day ago

> I don't accept the premise that "training on" and "copying" are the same thing...

Nor do I. Training and copying are clearly different things... and if these tools had never emitted -verbatim- nontrivial chunks of the code they'd ingested, [0] I'd be much less concerned about them. But as it stands now, some-to-many of the companies that build and deploy these machines clearly didn't care to ensure that their machines simply wouldn't plagiarize.

I've a bit more commentary that's related to whether or not what these companies are doing should be permitted here. [1]

[0] Based on what I've seen, when it happens, it is often with either incorrect copyright and/or license notifications, or none of the verbiage the license of the copied code requires in non-trivial reproductions of that code.

[1] <https://news.ycombinator.com/item?id=44166983>

tptacek

5 days ago

I come from a family musicians (I'm the only non-musician in it).

lolinder

4 days ago

And I'm the only one in mine who isn't either a musician or an author. I'm not sure why you believe that being in a creative family gives you some sort of divine authority to condemn the rest of us for our collective sins.

simoncion

a day ago

Ah good. If one of your family were to bring a plagiarism suit against another musician (or company (regardless of whether that company's music was produced by humans or robots)) that'd clearly ripped off their work, would you decry them as a hypocrite?

If not, why not?

If so, (seriously, earnestly) kudos for being consistent in your thoughts on the matter.

lolinder

5 days ago

The second paragraph in OP's comment is absolutely a reasonable paraphrase of your argument. I read your version many times over to try to find the substance and... that is the substance. If you didn't mean it to be then that section needed to be heavily edited.

emilyst

5 days ago

What really resonated with me was your repeated calls for us at least to be arguing about the same thing, to get on the same page.

Everything about LLMs and generative AI is getting so mushed up by people pulling it in several directions at once, marketing clouding the water, and the massive hyperbole on both sides, it's nearly impossible to understand if we're even talking about the same thing!

cyral

5 days ago

It's a good post and I strongly agree with the part about level setting. You see the same tired arguments basically every day here and subreddits like /r/ExperiencedDevs. I read a few today and my favorites are:

- It cannot write tests because it doesn't understand intent

- Actually it can write them, but they are "worthless"

- It's just predicting the next token, so it has no way of writing code well

- It tries to guess what code means and will be wrong

- It can't write anything novel because it can only write things it's seen

- It's faster to do all of the above by hand

I'm not sure if it's the issue where they tried copilot with gpt 3.5 or something, but anyone who uses cursor daily knows all of the above is false, I make it do these things every day and it works great. There was another comment I saw here or on reddit about how everyone needs to spend a day with cursor and get good at understanding how prompting + context works. That is a big ask but I think the savings are worth it when you get the hang of it.

tptacek

5 days ago

Yes. It's this "next token" stuff that is a total tell we're not all having the same conversation, because what serious LLM-driven developers are doing differently today than they were a year ago has not much at all to do with the evolution of the SOTA models themselves. If you get what's going on, the "next token" thing has nothing at all to do with this. It's not about the model, it's about the agent.

phkahler

5 days ago

>> Blaming LLM hallucinations on the programming language?

My favorite was suggesting that people select the programming language based of which ones LLMs are best at. People who need an LLM to write code might do that, but no experienced developer would. There are too many other legitimate considerations.

mediaman

5 days ago

If an LLM improves coding productivity, and it is better at one language than another, then at the margin it will affect which language you may choose.

At the margin means that both languages, or frameworks or whatever, are reasonably appropriate for the task at hand. If you are writing firmware for a robot, then the LLM will be less helpful, and a language such as Python or JS which the LLM is good at is useless.

But Thomas's point is that arguing that LLMs are not useful for all languages is not the same as saying that are not useful for any language.

If you believe that LLM competencies are not actually becoming drivers in what web frameworks people are using, for example, you need to open your eyes and recognize what is happening instead of what you think should be happening.

(I write this as someone who prefers SvelteJS over React - but LLM's React output is much better. This has become kind of an issue over the last few years.)

rapind

5 days ago

I'm a little (not a lot) concerned that this will accelerate the adoption of languages and frameworks based on their popularity and bury away interesting new abstractions and approaches from unknown languages and frameworks.

Taking your react example, then if we we're a couple years ahead on LLMs, jQuery might now be the preferred tool due to AI adoption through consumption.

You can apply this to other fields too. It's quite possible that AIs will make movies, but the only reliably well produced ones will be superhero movies... (I'm exaggerating for effect)

Could AI be the next Cavendish banana? I'm probably being a bit silly though...

simoncion

5 days ago

> I'm a little ... concerned that this will accelerate the adoption of languages and frameworks based on their popularity and bury away interesting new abstractions and approaches...

I'd argue that the Web development world has been choosing tooling based largely on popularity for like at least a decade now. I can't see how tooling selection could possibly get any worse for that section of the profession.

rapind

5 days ago

I disagree. There’s a ton of diversity in web development currently. I don’t think there’s ever been so many language and framework choices to build a web app.

The argument is that we lose this diversity as more people rely on AI and choose what AI prefers.

jhatemyjob

5 days ago

You raise a valid concern but you presume that we will stay under the OpenAI/Anthropic/etc oligopoly forever. I don't think this is going to be the status quo in the long-term. There is demand for different types of LLMs trained on different data. And there is demand for hardware. For example the new Mac Studio has 512gb VRAM which can run the 600B param Deepseek model locally. So in the future I could see people training their own LLMs to be experts at their language/framework of choice.

Of course you could disagree with my prediction and that these big tech companies are going to build MASSIVE gpu farms the size of the Tesla Gigafactory which can run godlike AI where nobody can compete, but if we get to that point I feel like we will have bigger problems than "AI react code is better than AI solidjs code"

rapind

4 days ago

I suspect we’ll plateau at some point and the gigafactories won’t produce a massive advantage. So running your own models could very well be a thing.

jhatemyjob

4 days ago

Yea probably..... I wonder when the plateau is. Is it right around the corner or 10 years from now? Seems like they can just keep growing it forever, based on what Sam Altman is saying. I'm botching the quote but either he or George Hotz said something to the effect of: every time you add an order of magnitude to the size of the data, there is a noticeable qualitative difference in the output. But maybe past a certain size you get diminishing returns. Or maybe it's like Moore's Law where they thought it would just go on forever but it turned out it's extremely difficult to get the distance between two transistors smaller than 7nm

mwarkentin

4 days ago

Yes, some think it’s happening now: https://www.aisnakeoil.com/p/is-ai-progress-slowing-down

svachalek

5 days ago

In the relatively near future this is going to be like arguing what CPU to buy based on how you like the assembly code. Human readability is going to matter less and less and eventually we will likely standardize on what the LLMs work with best.

simoncion

a day ago

> There’s a ton of diversity in web development currently.

You misunderstand me. It's not incompatible for a culture to choose options based largely on popularity (rather than other properties that one would expect to be more salient when making a highly-technical choice), and for there to also be many options to choose from.

steveklabnik

5 days ago

People make productivity arguments for using various languages all the time. Let's use an example near and dear to my heart: "Rust is not as productive as X, therefore, you should use X unless you must use Rust." If using LLMs makes Rust more productive than X, that changes this equation.

Feel free to substitute Y instead of Rust if you want, just I know that many people argue Rust is hard to use, so I feel the concreteness is a good place to start.

SoftTalker

5 days ago

Maybe they don’t today, or up until recently, but I’d believe it will be a consideration for new projects.

Is certainly true that at least some projects choose languages based on or at least influenced by how easy it is to hire developers fluent in that language.

PeterHolzwarth

5 days ago

I see no straw men in his arguments: what i see are pretty much daily direct quotes pasted in from HN comments.

Aurornis

4 days ago

> daily direct quotes pasted in from HN comments.

That’s literally the strawman.

dasil003

5 days ago

I am squarely in the bucket of AI skeptic—an old-school, code-craftsman type of personality, exactly the type of persona this article is framed again, and yet my read is nothing like yours. I believe he's hitting these talking points to be comprehensive, but with nothing approaching the importance and weightiness you are implying. For example:

> Claiming that IP rights shouldn’t matter because some developers pirate TV shows?

I didn't see him claiming that IP rights shouldn't matter, but rather that IP rights don't matter in the face of this type of progress, they never have since the industrial revolution. It's hypocritical (and ultimately ineffectual) for software people to get up on a high horse about that now just to protect their own jobs.

And lest you think he is an amoral capitalist, note the opening statement of the section: "Artificial intelligence is profoundly — and probably unfairly — threatening to visual artists in ways that might be hard to appreciate if you don’t work in the arts.", indicating that he does understand and empathize with the most material of harms that the AI revolution is bringing. Software engineers aren't on that same spectrum because the vast majority of programming is not artisinal creative work, it's about precise automation of something as cheaply as possible.

Or this one:

> Blaming LLM hallucinations on the programming language?

Was he "blaming"? Or was he just pointing out that LLMs are better at some languages than others? He even says:

> People say “LLMs can’t code” when what they really mean is “LLMs can’t write Rust”. Fair enough!

Which seems very truthy and in no way is blaming LLMs. Your interpretation is taking a some kind of logical / ethical leap that is not present in the text (as far as I can tell).

simoncion

5 days ago

> Software engineers aren't on that same spectrum because the vast majority of programming is not artisinal creative work...

That's irrelevant. Copyright and software licensing terms are still enforced in the US. Unless the software license permits it, or it's for one of a few protected activities, verbatim reproduction of nontrivial parts of source code is not legal.

Whether the inhalation of much (most? nearly all?) of the source code available on the Internet for the purpose of making a series of programming machines that bring in lots and lots of revenue for the companies that own those machines is either fair use or it's infringing commercial use has yet to be determined. Scale is important when determining whether or not something should be prohibited or permitted... which is something that many folks seem to forget.

komatineni

5 days ago

Not a programmer now but my real takeaway from working with developers, enterprises at various stages of adoption: AI / LLM are not magic. The hype still exists but basically AI/LLM are tools. How we use these tools is what makes the enterprises different. unfortunately the quantity hype is '+' by big boys from MS, Goog adding to the noise. Personally, I love to see the quality and most importantly some real examples going beyond coding assistance. Be it automating the QA/QC or even some of the infra

benregenspan

5 days ago

One other thing that might feed into skepticism is that, for the average person who does not actively choose to use LLMs, the most regular LLM interaction is likely to be with AI Overviews in Google search, which are generally a pretty unfortunate experience. For example my last search with an AI Overview, for interpretation of blood pressure reading, resulted in the output: "Systolic pressure (top number): 123 mmHg, which is within the normal range of less than 120 mmHg".

The fact that LLMs can generate this kind of output is not incompatible with the OP's argument about how useful they can be when combined with agents and code review. But search is the lowest-friction way for anyone to interact with LLMs currently, and right now it's a pretty poor advertisement.

a_bonobo

5 days ago

I find the Konwinski Prize to be very interesting in this context. 1 million dollars to whoever's open source LLM solves >90% of a set of novel Github issues.

https://www.kaggle.com/competitions/konwinski-prize/

Currently, the #1 spot sits at a score of 0.09, not 0.9. A far cry from being useful. I know that open source models are not as good as closed source, but still, we're a long way from LLMs being good for code on their own.

And that supports OP's point - these tools aren't AGI, they produce trash that needs evaluation, but they're still useful.

virgildotcodes

5 days ago

Am I misunderstanding or are the models also limited to those that can be run with less than 96 gigs of VRAM?

The models that are both open source and quantized so that they can fit within that much memory are going to be significantly less capable than full scale frontier closed source models, I wonder how the latter would perform.

naasking

5 days ago

> Currently, the #1 spot sits at a score of 0.09, not 0.9. A far cry from being useful.

The best intellisense and code completion tools would solve 0.00. Those were the only tools we were using just a couple of years ago. 0.09 is a tremendous jump and the improvements will accelerate!

kortilla

5 days ago

Assuming acceleration or even continued improvement is pretty naive.

naasking

5 days ago

You mean assuming technology improves, which it has for centuries and given there is considerable incentive for it to improve, is naive? Seems inevitable.

Do you think humans have achieved peak intelligence? If so, why, and if not, then why shouldn't you expect artificial forms of intelligence to improve up to and even surpass humans abilities at some point?

Edit: to clarify, I'm not necessarily assuming unbounded acceleration, but tools always start out middling, improvements accelerate as we figure out what works and what doesn't, and then they taper off. We're just starting on the acceleration curve for AI.

kortilla

3 days ago

Performance on these scores frequently hits plateaus because the style of technology is just unfit for the task.

We are quite far into the development cycle of LLMs. Literally billions of dollars have been poured into them. The rate of improvements over the last 6-12 months has slowed, not accelerated.

There hasn’t been any hint on AGI breakthroughs, so we’re dealing with the tools to help herd stochastic parrots (i.e. agents) for the foreseeable future. And those tools are to just help with how much LLMs hallucinate, it doesn’t make them more creative in a way to improve these scores.

naasking

3 days ago

> We are quite far into the development cycle of LLMs.

No, we've barely scratched the surface. Billions of dollars have been poured into the stupidest possible thing that could work + scaling, and we're only now trying more clever things. Fine-tuning on specific tasks will yield considerable productivity benefits in those domains.

I'm not only skeptical of your claim on the "rate of improvements over the last 6-12 months", but it's not even a compelling time horizon to infer any kind of trend at this stage.

kortilla

2 days ago

The needle hasn’t meaningfully moved on these types of tasks though. There was a class of problems that LLMs could nearly immediately help with that brought us quickly to the current rate. But then we stalled and it hasn’t meaningfully moved with any of the major releases

jachee

5 days ago

They’re tab-completion with extra cognitive-load steps.

a_bonobo

5 days ago

I mean, if you can solve 9% of Github issues automatically that's a fairly huge load of work you can automate. Then again you'd have to manually identify which 9% of issues.

blibble

5 days ago

"update dependencies"

that would probably cover it, and you don't need "AI" to do that

dboreham

5 days ago

You do if it's a JS project because every patch release of every dependency makes breaking API changes.

tantricked

15 hours ago

As a junior, I feel most complexity in software is manufactured. LLMs simplify that mess for me, making it easier to get things done. But I’m constantly hit with imposter syndrome, like I’m less skilled because I rely on AI to handle the tricky stuff. And Gemini is better than me!

Aurornis

5 days ago

I tried to take this article seriously, but it seems more like HN ragebait than an actual attempt at discussion. The engagement bait appears to be working, given all of the comments.

It’s like half of the arguments are designed as engagement bait with logical consistency being a distant concern:

> If hallucination matters to you, your programming language has let you down.

This doesn’t even make sense. LLMs hallucinate things beyond simple programming language constructs. I commonly deal with allusions to functions or library methods that would be great if they existed, but the LLM made it up on the spot.

The thing is, the author clearly must know this. Anyone who uses LLMs knows this. So why put such a bizarre claim in the article other than as engagement bait to make readers angry?

There are numerous other bizarre claims throughout the article, like waving away the IP rights argument because some programmers pirate TV shows? It’s all so bizarre.

I guess I shouldn’t be surprised to scroll to the bottom and see that the author is a HN comment section veteran, because this entire article feels like it started as a reasonable discussion point and then got twisted into Hacker News engagement bait for the company blog. And it’s working well, judging by the engagement counts.

tpmoney

5 days ago

> This doesn’t even make sense. LLMs hallucinate things beyond simple programming language constructs. I commonly deal with allusions to functions or library methods that would be great if they existed, but the LLM made it up on the spot.

I think the author's point is your language (and more generally the tooling around it) should make this obvious, and almost all the AI agents these days will minimally run linting tools and clean up lints (which would include methods and library imports that don't exist) if they don't actively attempt to compile and test the code they've written. So you as the end user should (almost) never be seeing these made up functions.

bwfan123

5 days ago

agree - it is written like clickbait or worse like a sponsored piece.

> But “hallucination” is the first thing developers bring up when someone suggests using LLMs, despite it being (more or less) a solved problem.

really ? what is the author smoking to consider it a solved problem ? This statement alone invalidates the entire article in its casual irreverence for the truth.

I use copilot everyday, and I know where it shines. Please dont try to sell it to me with false advertising.

Dylan16807

5 days ago

The article specifically says it's not talking about copilot, but talking about agents that verify the code compiles before they show it to you.

If it uses a function, then you can be sure that function is real.

Was this not clear? The explanation I'm paraphrasing is right in between the line Aurornis quoted and the line you quoted. Except for the crack at copilot that's up at the top.

bwfan123

5 days ago

Have you read the hilarious PRs that copilot put out last week ? it is here for your reference [1]. The humor is in the giant gap between what it can do, and what the hype says it can do.

Can you show me 1 PR put out by any agent in any open-source repo with wide usage ?

[1] https://www.reddit.com/r/ExperiencedDevs/comments/1krttqo/my...

Karrot_Kream

5 days ago

An instance of dogfooding, but supposedly the last release of Aider [1] had the agent write 79% of its code.

[1]: https://github.com/Aider-AI/aider/blob/main/HISTORY.md#aider...

smy20011

6 days ago

> Often, it will drop you precisely at that golden moment where shit almost works, and development means tweaking code and immediately seeing things work better. That dopamine hit is why I code.

Only if you are familiar with the project/code. If not, you were throw into a foreign codebase and have no idea how to tweak it.

ofjcihen

6 days ago

And potentially make incredibly risky mistakes while the AI assures you it’s fine.

skywhopper

5 days ago

This is one of the most obnoxious things I’ve ever read. “I’m right and every vague straw man I can think of is stupid and naive.” Great. Why did you write this?

exfalso

5 days ago

I was skeptical at first, but after trying to use it several times for work (Cursor/Claude) I concluded that it's worse than I thought. Lot more time wasted for little to no results. Maybe for boilerplate frontend code or TODO app generation it's good or something?

Anyway in a couple of months I'll try again. Until that time, I guess the search result summaries are nice?

drodgers

5 days ago

> "LLMs can’t write Rust"

This really doesn't accord with my own experience. Using claude-code (esp. with opus 4) and codex (with o3) I've written lots of good Rust code. I've actually found Rust helps the AI-pair-programming experience because the agent gets such good, detailed feedback from the compiler that it can iterate very quickly and effectively.

Can it set up great architecture for a large, complex project from scratch? No, not yet. It can't do that in Ruby or Typescript either (though it might trick you by quickly getting something that kinda works in those languages). It think that will be a higher bar because of how Rust front-loads a lot of hard work, but I expect continuing improvement.

wdroz

5 days ago

I agree, the latest models are not bad at Rust. Most issues I have when writing LLM-assisted Rust code are related to dependencies.

There are a lot of v0.x.x crates in Rust with frequent updates and breaking changes. As LLMs "learn" from code in the wild, they are exposed to different way to use the same crate and thus fail to produce working code "from memory".

But as soon as you explain how to use it with the latest syntax, they are able to fix the code.

onepremise

6 days ago

These systems do work, quite well. Especially the latest release of Claude. Palantir is using AI to process your IRS information without your consent, https://archive.is/QAY9E. This should concern everybody. Not only can this information be used to determine specific account details. They can use this to target individuals which don't align with the current administration's agenda, as well exclude hiring base of political alignment. They could also use this data to exclude and discriminate those against Peter Theil's network state agenda aka "Freedom Cities", but it seems everybody is complacent with that.

satisfice

6 days ago

You know what, I can use the word “fuck” too, just like this pro coder does: I’m fucking tired of these fucking nuts who act as if there is no problem at all having AI mess with my code. It harms my relationship with my code! It reduces my agency. It threatens my accountability.

He implies that if I am talking about some other flavor of AI than he personally uses than I am just not serious— but that’s what they said six months ago and two years ago. These fucking boosters have already blown their credibility.

If you want to prove that AI agents writing code for you is safe then perform a large scale test and show your results. We’re not crazy for doubting you. You are crazy for not doubting yourself.

stock_toaster

6 days ago

> pull in arbitrary code from the tree, or from other trees online, into their context windows,

I guess this presupposes that it is ok for 3rd parties to slurp up your codebase? And possibly (I guess it ostensibly depends on what plan you are on?) using that source code for further training (and generating that same code for others)?

I imagine in some domains this would not be ok, but in others is not an issue.

greybox

6 days ago

> As a mid-late career coder, I’ve come to appreciate mediocrity. You should be so lucky as to have it flowing almost effortlessly from a tap.

If the kind of work this person is doing requires no thought, then of course an LLM will be perfect for it.

metaltyphoon

5 days ago

> I’ve come to appreciate mediocrity

I don’t care how long I can be in my career, I just can’t do BE like this. It feels off character for me to be mediocre.

marginalia_nu

6 days ago

It's so all-or-nothing this debate. If you're drawing a benefit from using AI tools, great. If you aren't, then maybe don't use them, or try some other approach to using them.

Personally I find AI coding tools situationally useful. I certainly wouldn't use them to write all my code, but I also think I'd be a fool not to leverage them at all.

bawolff

6 days ago

Sometimes i feel like people who really like AI have a very different experience programming then i do.

They are constantly talking about AI doing all the tedious boilerplate bullshit. Don't get me wrong, some of my code is that too and its not fun. However the pro-AI people talk as if 80% of your day is dealing with that. For me its simply a rare enough occurence that the value proposition isn't that big. If that is the killer app of AI, it just doesn't sound that exciting to me.

FridgeSeal

6 days ago

When I see someone talk about the reams of boilerplate they’re getting the LLM to write for them, I really do wonder what godawful sounding tools and tech-stack they’re being subjected to.

JoshTriplett

6 days ago

Exactly. Back in the day, people talked about "design patterns". It took a while for (some of) the industry to recognize that "design patterns" are a sign that your libraries and tools aren't good enough, because you're having to write the same patterns repeatedly.

rudedogg

6 days ago

Anything where you’re doing basic CRUD apps. Yes there are generators, but not for everything. For me that’s where LLMs have been the most useful.

prisenco

6 days ago

Unpopular opinion, boilerplate is good for you. It's a warmup before a marathon. Writing it can be contemplative and zen-like and allows you to consider the shape of the future.

Mofpofjis

5 days ago

Strongly agreed. And, your wording is excellent.

lolinder

5 days ago

This kind of guilt-by-association play might be the most common fallacy in internet discourse. None of us are allowed to express outrage at the bulk export of GitHub repos with zero regard for their copyleft status because some members of the software engineering community are large-scale pirates? How is that a reasonable argument to make?

The most obvious problem with this is it's a faulty generalization. Many of us aren't building large-scale piracy sites of any sort. Many of us aren't bulk downloading media of any kind. The author has no clue whether the individual humans making the IP argument against AI are engaged in piracy, so this is an extremely weak way to reject that line of argument.

The second huge problem with this argument is that it assumes that support for IP rights is a blanket yes/no question, which it's obviously not. I can believe fervently that SciHub is a public good and Elsevier is evil and at the same time believe that copyleft licenses placed by a collective of developers on their work should be respected and GitHub was evil to steal their code. Indeed, these two ideas will probably occur together more often than not because they're both founded in the idea that IP law should be used to protect individuals from corporations rather than the other way around.

The author has some valid points, but dismissing this entire class of arguments so flippantly is intellectually lazy.

sfRattan

5 days ago

> The author has some valid points, but dismissing this entire class of arguments so flippantly is intellectually lazy.

Agree 100%. And generally programmers have a poor understanding of the law, especially common law as it applies in America (the country whose legal system most software licenses have been written to integrate with, especially copyleft principles).

American Common Law is an institution and continuity of practice dating back centuries. Everything written by jurists within that tradition, while highly technical, is nonetheless targeted at human readers who are expected to apply common sense and good faith in reading. Where programmers declare something in law insufficiently specified or technically a loophole, the answer is largely: this was written for humans to interpret using human reason, not for computers to compile using limited, literal algorithms.

Codes of law are not computer code and do not behave like computer code.

And following the latest AI boom, here is what the bust will look like:

1. Corporations and the state use AI models and tools in a collective attempt to obfuscate, diffuse, and avoid accountability. This responsibility two-step is happening now.

2. When bad things happen (e.g. a self-driving car kills someone, predictive algorithms result in discriminatory policy, vibe coding results in data leaks and/or cyberattacks), there will be litigation that follows the bad things.

3. The judges overseeing the litigation will not accept that AI has somehow magically diffused and obfuscated all liability out of existence. They will look at the parties at hand, look at relevant precedents, pick out accountable humans, and fine them or---if the bad is bad enough---throw them in cages.

4. Other companies will then look at the fines and the caged humans, and will roll back their AI tools in a panic while they re-discover the humans they need to make accountable, and in so doing fill those humans back in on all the details they pawned off on AI tools.

The AI tools will survive, but in a role that is circumscribed by human accountability. This is how common law has worked for centuries. Most of the strange technicalities of our legal system are in fact immune reactions to attempts made by humans across the centuries to avoid accountability or exploit the system. The law may not be fast, but it will grow an immune response to AI tools and life will go on.

xpe

5 days ago

I agreed with this comment until the second half which is just one scenario - one that is contingent on many things happening in specific ways.

dboreham

5 days ago

In other words: this will probably all end in tears.

agnishom

5 days ago

It's not just "guilt-by-association". It is a much worse reactionary general argument. It can be applied to any kind of moral problem to preserve the status quo.

If this was a legitimate moral argument, we'd never make any social progress.

mcintyre1994

5 days ago

That whole section seems so out of place. I don't know why he thinks "The median dev thinks Star Wars and Daft Punk are a public commons" either. I don't know why he thinks the entire software engineering profession is about enabling piracy. I suspect Netflix has more software engineers doing the opposite than every piracy service employs combined.

shiomiru

5 days ago

It's not just lazy, it's nonsense. The author is conflating piracy with plagiarism, even though the two are completely different issues.

Plagiarism is taking somebody else's work and claiming that you yourself created it. It is a form of deception, depriving another of credit while selling their accomplishments as your own.

Piracy on the other hand is the violation of a person's monopoly rights on distributing certain works. This may damage said person's livelihood, but the authorship remains clear.

mattl

5 days ago

I’m a free software developer and have been for over 25 years. I’ve worked at many of the usual places too and I enjoy and appreciate the different licenses used for software.

I’m also a filmmaker and married to a visual artist.

I don’t touch this stuff at all. It’s all AI slop to me. I don’t want to see it, I don’t want to work with it or use it.

xpe

5 days ago

Some people make these kinds of claims for ethical reasons, I get it. But be careful to not confuse one’s ethics with the current state of capability, which changes rapidly. Most people have a tendency to rationalize, and we have to constantly battle it.

Without knowing the commenter above, I’ll say this: don’t assume an individual boycott is necessarily effective. If one is motivated by ethics, I think it is morally required to find effective ways to engage to shape and nudge the future. It is important to know what you’re fighting for (and against). IP protection? Human dignity through work? Agency to effect one’s life? Other aspects? All are important.

mattl

5 days ago

I run a music community that’s been around for 16 years and many users are asking me what they can do to avoid AI in their lives and I’m going to start building tools to help.

Many of the people pushing for a lot of AI stuff are the same people who have attached their name to a combination of NFTs, Blockchain, cryptocurrency, Web3 and other things I consider to be grifts/scams.

The term “AI” is already meaningless. So let’s be clear: Generative AI (GenAI) is what worries many people including a number of prominent artists.

This makes me feel like there’s work to be done if we want open source/art/the internet as we know it to remain and be available to us in the future.

It drives me a little crazy to see Mozilla adding AI to Firefox instead of yelling about it at every opportunity. Do we need to save them too?

Flemlo

5 days ago

Just because random people pivot from shit like not, crypto and block chains, the majority of people use AI because it has real benefits.

GenAI just works. People don't need to be pushed using it and continue using it.

OpenAI has 500 Million active users weekly.

xpe

5 days ago

Right. Too often people conflate (a) risk-loving entrepreneurs and their marketing claims with the (b) realities of usage patterns and associated value-add.

As an example, look at how cars are advertised. If you only paid attention to marketing, you would think everyone is zipping around winding mountain roads in their SUVs, loaded up with backcountry gear. This is not accurate, but nonetheless SUVs are dominant.

taurath

5 days ago

"Morally required to ... engage" with technologies that one disagrees with sounds fairly easily debunk-able to me. Everyone does what they can live with - being up close and personal, in empathy with humans who are negatively effected by a given technology, they can choose to do what they want.

Who knows, we might find out in a month that this shit we're doing is really unsafe and is a really bad idea, and doesn't even work ultimately for what we'd use it for. LLMs already lie and blackmail.

xpe

5 days ago

Five points. First, a moral code is a guidestar, principles to strive for, but not necessarily achieved.

Second. People can do what they want? This may not even be self-consistent. Humans are complex and operate imperfectly across time horizons and various unclear and even contradictory goals.

Third. Assuming people have some notion of consistency in what they want, can people can do what they want? To some degree. But we live in a world of constraints. Consider this: if one only does what one wants, what does that tell you? Are they virtuous? Maybe, maybe not: it depends on the quality of their intentions. Or consider the usual compromising of one’s goals: people often change what they want to match what is available. Consider someone in jail, a parent who lost a child, a family in a war zone, or someone who isn’t able to get the opportunity to live up to their potential.

Fourth, based on #3 above, we probably need to refine the claim to say this: people strive to choose the best action available to them. But even in this narrower framing, saying “people do what they can” seems suspect to me, to the point of being something we tell ourselves to feel better. On what basis can one empirically measure how well people act according to their values? I would be genuinely interested in attempts to measure this.

Fifth, here is what I mean by engaging with a technology you disagree with: you have to engage in order to understand what are you are facing. You should clarify and narrow your objections: what aspects of the technology are problematic? Few technologies are intrinsically good or evil; it is usually more about how they are used. So mindfully and selectively use the technology in service of your purposes. (Don’t protest the wrong thing out of spite.) Focus on your goals and make the hard tradeoffs.

Here is an example of #5. If one opposes urban development patterns that overemphasize private transit, does this mean boycotting all vehicles? Categorically refusing to rent a car? That would miss the point. Some of one’s best actions involve getting involved in local politics and advocacy groups. Hoping isolated individual action against entrenched interests will move the needle is wishful thinking. My point is simple: choose effective actions to achieve your goals. Many of these goals can only be achieved with systematic thinking and collective action.

taurath

5 days ago

Just responding to 5 here, as I think the rest is a capable examination but I think starts to move around the point I'm trying to make, that I disagree that one morally has to engage with AI. Its not just to "understand what you are facing" - that's a tactical choice, not a moral one. Its just not a moral imperative. Non-engagement can be a protest as well. Its one of the ways that the overton window maintains itself - if someone were to take the, to me, extreme view that AI/LLMs will within the next 5 years result in massive economic changes and eliminate much of society's need for artists or programmers, I choose not to engage with that view and give it light. I grew up around doomsayers and those who claim armageddon, and the arguments being made are often on similar ground. I think they're kooks who don't give a fuck about the consequences of their acceleration-ism, they're just chasing dollars.

Just as I don't need to understand the finer points of extreme bigotry to be opposed to it, we don't need to be experts on LLMs to be opposed to the well-heeled and breathless hype surrounding it, and choose to not engage with it.

xpe

4 days ago

> Just as I don't need to understand the finer points of extreme bigotry to be opposed to it, we don't need to be experts on LLMs to be opposed to the well-heeled and breathless hype surrounding it, and choose to not engage with it.

If by the last "it" you mean "the hype", then I agree.

But -- sorry if I'm repeating -- I don't agree with conflating the tools themselves with the hype about them. It is fine to not engage with the hype. But it is unethical to boycott LLM tooling itself when it could serve ethical purposes. For example, many proponents of AI safety recommend using AI capabilities to improve AI safety research.

This argument does rely on consequentialist reasoning, which certainly isn't the only ethical game in town. That said, I would find it curious (and probably worth unpacking / understanding) if one claimed deontological reasons for avoiding a particular tool, such as an LLM (i.e. for intrinsic reasons). To give an example, I can understand how some people might say that lying is intrinsically wrong (though I disagree). But I would have a hard time accepting that _using_ an LLM is intrinsically wrong. There would need to be deeper reasons given: correctness, energy usage, privacy, accuracy, the importance of using one's own mental faculties, or something plausible.

xpe

4 days ago

In case it got lost from several comments higher in the chain, there is/was an "if" baked into my statement:

>> If one is motivated by ethics, I think it is morally required to find effective ways to engage to shape and nudge the future.

Put another way, the claim could be stated as: "if one is motivated by ethics, then one should pay attention to consequences". Yes, this assumes one accepts consequentialism to some degree, which isn't universally accepted nor easy to apply in practice. Still, I don't think many people (even those who are largely guided by deontology) completely reject paying attention to consequences.

xpe

5 days ago

I’m outside the edit window, but I have one thing to add. To restate #5 differently: Banging one’s head against barely movable reality is not a wise plan; it is reactionary and probably a waste of your precious bodily fluids — I mean energy. On the other hand, focusing your efforts on calculated risks, even if they seem improbable, can be worthwhile, as long as you choose your battles.

zahlman

5 days ago

> and at the same time believe that copyleft licenses placed by a collective of developers on their work should be respected and GitHub was evil to steal their code.

I think I missed a story? Is GitHub somehow stealing my code if I publish it there under GPL or similar? Or did they steal some specific bit of code in the past?

lolinder

5 days ago

Copilot was trained on all public code on GitHub and in the early days it could be made to actually vomit code that was identical to its training data. They've added some safeguards to protect against the latter, but a lot of people are still sore at the idea that Copilot trained on the data in the first place.

bdangubic

5 days ago

if your code is on Github it was/is being used as training data

keeda

5 days ago

> None of us are allowed to express outrage at the bulk export of GitHub repos with zero regard for their copyleft status because some members of the software engineering community are large-scale pirates?

I don't think that is an accurate representation of the tech community. On the other hand, I do think TFA is making a reasonable statistical representation of the tech community (rather than a "guilt-by-association" play) which could be rephrased as:

The overriding ethos in HN and tech communities has clearly been on the "information wants to be free" side. See: the widespread support of open source and, as your comment itself mentions, copyleft. Copyleft, in particular, is famously based on a subversion of intellectual property (cf "judo throw") to achieve an "information wants to be free" philosophy.

Unsurprisingly, this has also manifested countless times as condoning media piracy. Even today a very common sentiment is, "oh there are too many streaming services, where's my pirate hat yarrrr!"

Conversely, comments opposing media piracy are a vanishingly tiny, often downvoted, minority. As such, statistically speaking, TFA's evaluation of our communities seems to be spot on.

And, now the same communities are in an uproar when their information "wants to be free". The irony is definitely rich.

lolinder

5 days ago

First, I don't agree that what you just said is at all reflective of what TFA actually wrote. Yours makes it about statistics not individuals. Statistical groups don't have an ass to shove anything up, so TFA pretty clearly was imagining specific people who hold a conflicting belief.

And for that reason, I think your version exposes the flaw even more thoroughly: you can't reasonably merge a data set of stats on people's opinions on AI with a data set of stats on people's opinions on IP in the way that you're proposing.

To throw out random numbers as an example of the flaw: If 55% of people on HN believe that IP protection for media should not exist and 55% believe that GitHub stole code, it's entirely possible that TFA's condemnation only applies to 10% of the total HN population that holds the supposedly conflicting belief even though HN "statistically" believes both things.

And that's before we get into the question of whether there's actually a conflict (there's not) and the question of whether anyone is accurately measuring the sentiment of the median HN user by dropping into various threads populated by what are often totally disjoint sets of users.

keeda

5 days ago

Of course, it's not possible to strictly represent a large population with a single characteristic. But then it is also absolutely accurate to call the USA a capitalistic country even though there is a very diverse slate of political and economic beliefs represented in the population.

Now, you could say the capitalism is a function of the policies enacted by the country, which aren't a thing for online forums. But these policies are a reflection of the majority votes of the population, and votes are a thing on forums. Even a casual observation, starting from the earliest Slashdot days to modern social media, shows that the most highly upvoted and least contested opinions align with the "information wants to be free" philosophy.

To get a bit meta, you can think of this rhetoric along the lines of the following pattern which is common on social media:

Hacker News: "All software is glorified CRUD boilerplate! You are not a special snowflake! Stop cosplaying Google!"

Also Hacker News: "AI is only good for mindless boilerplate! It's absolutely useless for any novel software! AI boosters are only out to scam you!"

The sentiment is obviously not uniform and is shifting over time, even in this very thread... but it does ring true!

lolinder

5 days ago

It rings true but as with many things that seem intuitive it's an illusion.

Hacker News doesn't have opinions. Individuals on Hacker News have opinions. Different sets of individuals comment and vote on different threads. There's zero reason to suppose that it's the same people expressing both ideas or even that it's the same people voting on those ideas. To the contrary, I've spent enough time on this forum (way too much time) to know that there are whole sub-communities on HN that overlap very imperfectly. We self-select based on titles, topics, and even on time of day.

The only thing this kind of logic is good for is producing fallacious arguments dismissing someone's opinion because someone else holds a contradicting opinion.

keeda

4 days ago

Totally agreed that any large community must contain multiple diverse opinions. But when making a point it's impossible to address all relevant combinations, and so it's fine to generalize. Using the US as an example again, many Americans opposed the Iraq war but it is perfectly accurate to say that "the US invaded Iraq."

Now, generalizations can be faulty, but whether they ring true is a good proxy for their usefulness. And this point in TFA rings very true. Beyond just Hacker News or other social media, look at the blogosphere, industry "thought leaders", VCs, organizations like the EFF, startups, tech companies and their executives (and if you look closely, their lobbyists) on any matter involving intellectual property rights. The average reality that emerges is pretty stark and can be summarized as "what's mine is mine, what's yours is everybody's." Sure, many of us would disagree with that, but that is what the ultimate outcome is.

As such, I read that point not as singling out a specific set of people, but as an indictment of the tech community, and indeed, the industry as a whole.

popalchemist

6 days ago

This guy may be right about a lot of things said here but he's smug and it's off-putting. preaching to the choir.

nixpulvis

6 days ago

I keep talking to people who've had a good bit of success using gemini or cluade to build quick prototype front ends for some applications. I think theres some questions in my head of how well the process scales when you want to keep adding features, but according to them it's not been hard getting it to make the needed changes.

My issue with it is that it gates software development behind paid services with various levels of context supported. Absolutely not the dream I have of how more software should be open source and everyone should be empowered to make the changes they need.

manoji

5 days ago

> Think of anything you wanted to build but didn’t. You tried to home in on some > first steps. If you’d been in the limerent phase of a new programming > language, you’d have started writing. But you weren’t, so you put it off, for a > day, a year, or your whole career.

> I can feel my blood pressure rising thinking of all the bookkeeping and Googling and dependency drama > of a new project. An LLM can be instructed to just figure all that shit out. > Often, it will drop you precisely at that golden moment where shit almost works, and development means > tweaking code and immediately seeing things work better. That dopamine hit is why I code.

This is me . Just yesterday I finished something I have been putting off for 2 years. I maintain 2 mobile phones one because of work and i needed a way to see sms messages from the other phone. I wanted to write a simple andoird app that reads incoming messages and saves them somewhere for me to use for OTP's and other stuff. For the longest time I wanted to build something like this for myself but I have been putting it off just because of all the weird issues i had been getting with mobile development. A couple of days ago since I had some time and I fired up gemini and asked it to write me a simple flutter app for and to save the messages to supabase and voila it was amazing . With some help and some tweaking I have a fully functioning mobile app on andoird , a small tauri application on desktop to read the messages nicely formatted in a data table without much coding on my part at all. Truly remarkable.

Aziell

5 days ago

I was a bit skeptical at first too, but once I let AI dive into my code, run tests, refactor functions, and even catch its own bugs, it felt like I suddenly had a really competent teammate. AI has been genuinely helpful for me , it saves time and lets me focus on the things that actually matter.

yahoozoo

5 days ago

> lets me focus on the things that actually matter.

This reads right out of the PR release for these AI companies and their agentic coding solutions.

thetwentyone

6 days ago

The author posits that people don't like using LLMs with Rust because LLMs aren't good with Rust. Then people would migrate towards languages that do will with LLMs. However, if that were true, then Julia would be more popular since LLMs do very well with it: https://www.stochasticlifestyle.com/chatgpt-performs-better-...

creata

5 days ago

Does the linked study actually check that the LLM solves the task correctly, or just that the code runs and terminates without errors? I'm bad at reading, but the paper feels like it's saying the latter, which doesn't seem that useful.

tines

5 days ago

I mean, just to steelman the argument, the "market" hasn't had time to react to what LLMs are good at, so your rebuttal falls flat. I think the original statement is more a prediction than a statement of current affairs.

Also, the author didn't say that "ease of use with LLMs" is the _only_ factor that matters. Julia could have other things wrong with it that prevent it from being adopted.

eqvinox

6 days ago

> but the plagiarism […] Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.

Speeding is quite common too, yet if you get caught — especially overdoing it — you'll have a problem.

Also, in this case, presumably everything produced with AI is fair game too? The argument being made here isn't even "it's not plagiarism", rather "it's plagiarism but I don't care" — why would anyone else respect such an author's copyrights?

terminatornet

6 days ago

Loved this style of writing in 2005 from Maddox on the best site in the universe or whatever.

Sorry if I don't want google and openAI stealing my or anyone else's work.

xvilka

5 days ago

The problem is not about quality of the code, it simply almost never works. Just tried some models still in beta on the code of one of the C projects and, while some pieces were okay and could be used as is, some are complete garbage and wouldn't even compile. Maybe few years later it will not become a problem but right now they are useless. For the AI-assisted research and internet search, that's a different thing and works pretty well already.

digianarchist

5 days ago

I’m happy to see evidence to the contrary. Does anyone have a video of someone interacting with a LLM and it producing half decent code?

I’ve yet to witness this myself but that’s with my interactions. Which are limited to free offerings.

djaychela

5 days ago

"If hallucination matters to you, your programming language has let you down."

But every time I try this,this happens. I've tried the same thing on multiple agents and they all make up some magic api that doesn't exist, even when I've used the right thing elsewhere in the code to do the same job with that function commented as such. Asking them to use that always fails.

This seems a massive flaw to me that is just hand waved away here.

raspasov

5 days ago

“I work mostly in Go.”

That should have been the start of the essay.

New title: LLMs make Go programming feel good

ddoolin

6 days ago

I have been using agentic AI to help me get started writing an OpenGL-targeted game from scratch (no engine). I have almost no background experience with computer graphics code, but I understand most of the fundamentals pretty well and I have almost 13 years of software experience. It's just that the exact syntax as well as the various techniques used to address common problems are not in my arsenal yet.

My experience has been decent. I don't know that it has truly saved me much time but I can understand how it FEELS like it has. Because it's writing so much code (sometimes), it's hard to vet all of it and it can introduce subtle bugs based on faulty assumptions it made about different things. So, it will dump a lot of code at once, which will get me 90% of the way there, but I could spend an hour or two trying to nudge it to fix it to get it to 100%. And then I will probably still need to go back and reorganize it, or have it go back and reorganize it. And then sometimes it will make small adjustments to existing, committed code that will subtly break other things.

Something that has surprised me (in hindsight, it isn't surprising) is that sometimes when I feel like it misunderstood something or made a faulty assumption, it was actually me that had the misunderstanding or ignorance which is humbling at times and a good learning experience. It is also pretty good at bug hunting and DEFINITELY very good at writing unit tests.

I count myself as pretty lucky that this domain seems to be very well covered in training. Given the law of averages, most people's domains will probably be covered. I'm not sure how it would fare with a niche domain.

jandrese

6 days ago

> which will get me 90% of the way there

This is roughly my experience as well. The AI is great at the first 90% of the work and actively counterproductive for the remaining 90%

shinycode

6 days ago

And wait until there is 500 million of generated loc no one read and the product needs to evolve every week.

ddoolin

5 days ago

I think it's for this and other reasons that this kind of tool is not great for beginners. I can personally vet all of those LoC to make sure they are organized in a way that makes sense as it grows (limited to my own skill, lets assume I'm a master) but obviously someone with low XP won't have that ability. Given what I've seen in this project, that will CERTAINLY happen. Even telling it to reorganize code, you have to be specific.

shinycode

3 days ago

I agree with you. The problem is all those people saying that « soon » we won’t have to review any code and other agents will review, merge, deploy. As in a closed loop. The companies believing that will that great risks at offloading those skills without supervision…

sarchertech

6 days ago

This article feels incredibly defensive. If you have really have a technique that makes you 100x, 50x, or even just 2x more productive, you don’t need to write an article calling people who don’t agree with you nuts.

You keep using that tool, to your advantage. I’d you’re really altruistic you post some videos of how productive you can be like DHH did with his blog in 15 minute videos.

If you’re really that much more productive, the skeptics won’t be able to keep up and it should only take 6 months or some for that to become self evident.

hn_throwaway_99

6 days ago

I felt this article was a lot of strawman-ing.

Yes, there are people who think LLMs are just a fad, just like NFTs, and I agree these people are not really serious and that they are wrong. I think anyone who has used an AI coding agent recently knows that they are highly capable and can enhance productivity in the right hands.

But, as someone who gets a lot of value in AI coding agents, my issue is not with gen AI as a productivity enhancing tool - it's with the absolute torrent of BS about how AI is soon going to make coders obsolete, and the way AI has been shoved onto many engineering teams is like yet another incarnation of the latest management fad. My specific arguments:

1. As the author pretty much acknowledges, AI agents still basically suck at large, system-wide "thinking" and changes. And the way they work with their general "guess and check" method means they can churn out code that is kinda sorta right, but often leaves huge holes or outright laughable bugs.

2. Hallucinations are the worst possible failure modes - they look correct, which makes it all the more difficult to determine they're actually bullshit. I shudder to think about who will need to maintain the mountains of "vibe code" that is now being generated. Certainly not fucking me; I had a good career but I think now is definitely the right time to peace out.

3. Even if I could totally agree that there is a strong business case for AI, I can still, as an individual, think it makes my job generally shittier, and there is nothing wrong with having that opinion.

I don't think I'd be so anti-AI if I saw a rational, cautious debate about how it can enhance productivity. But all I see are folks with a vested interest overselling its capabilities and minimizing its downsides, and it just feels really tiresome.

ww520

5 days ago

I use LLM as a super Stackoverflow. So far it has been helpful. For code gen, 8 out of 10 cases, the code won't compile and need fixing. LLM did help a lot in learning new topics. Gemini did help tremendously recently in pointing out a new approach to a problem I was stumped on. Though the code it generated was garbage, the code did provide insight on a new approach which I adopted to make it work eventually.

alkonaut

6 days ago

I can’t even get copilot to autocomplete 5 working lines consistently. I spend hours every day arguing with ChatGPT about things it’s hallucinating. And Agents? It took me a year to convince anyone to buy me a copilot subscription. It’s not good enough now? But it was the bees knees just a year or two ago? See I hate the thing where the JS-framework tempo thing happens to the part of the software world I’m in.

TheAlchemist

5 days ago

His arguments are mostly right I would say. I tend to use a bit of LLMs for coding - just Copilot though - and I do like how it quite often saves me time with autocomplete spanning entire lines.

ChatGPT is also pretty usefull - especially for naming things.

However, on a global scale, I would compare the shift towards 'vibe coding' and massive LLMs use to the one we got when people doing the coding don't even know what Assembly even is. Current state of software is that it's incredibly bloated and slow (and I do love python). Doesn't matter that much since we've got massive improvements in hardware, but still. In 5 years, instead of not really realizing that we're writing slow and bloated code, we will not really realize we're writing incorrect code.

I don't know anything about web developement, but in the environements I've seen during my career - where most code is quite critical and a stupid mistake, one that an LLM could easily hallucinate, can cost a lot - I don't see it replacing humans anytime soon. It's a pretty good and usefull tool though.

notnullorvoid

5 days ago

I like LLMs, it's legitimately impressive to see people vibe code solutions (some times even to novel ideas). Agents are a much better use of LLMs than auto complete (which is terrible).

However I've tried coding agents, and well they suck... for me. Nearly all of the code I write these days is performance sensitive, as well as design sensitive (making abstractions for others to use), and not close to what could be considered boilerplate. The boilerplate I do write usually takes a trivial amount of time to write by hand. I'll keep trying whatever is new in AI every few months, I certainly don't want to be left behind, but so far that strategy hasn't left me optimistic.

> but the code is shitty, like that of a junior developer

It's worse than a junior. Juniors learn and grow, if they're a good junior it's often fairly rapid and self directed. Juniors can also ask for guidance or help, this is something maybe agents will be capable of in the future, but I could see it going pretty poorly for reasons tied to the first point about not learning. While LLMs contain a ton of information, more than a junior could, the junior is a better reasoning machine; I don't consider LLM "reasoning" to be anything close to the meatbag version.

> but the craft

Craft matters, it's important to enjoy and take pride in your work. If you enjoy managing AI agents that's fine, lots of programmers end up on the manager career path, but it's not for everyone.

iLoveOncall

6 days ago

I simply do not get this argument about LLMs writing tedious code or scaffolding. You don't need or want LLMs for that, you want libraries and frameworks.

I barely write any scaffolding code, because I use tools that setup the scaffolding for me.

halpow

6 days ago

If you're lucky to work in such an environment, more power to you. A lot of people have to deal with React where you need so much glue for basic tasks, and React isn't even the worst offender. Some boilerplate you can't wrap.

iLoveOncall

6 days ago

I use React at work, there is barely any boilerplate. I actually started a brand new project based on React recently and the initial setup before working on actual components was minutes.

imiric

6 days ago

> People coding with LLMs today use agents. Agents get to poke around your codebase on their own. They author files directly. They run tools.

I'll be damned if I give up control of my machine to a tool that hallucinates actions to take using hastily put together and likely AI-generated "agents". I still want to be the primary user of my machine, and if that means not using cutting edge tools invented in the last 6 months, so be it. I don't trust the vast majority of tools in this space anyway.

> I’m sure there are still environments where hallucination matters.

Still? The output being correct matters in _most_ environments, except maybe art and entertainment. It especially matters in programming, where a 99% correct program probably won't compile.

> But “hallucination” is the first thing developers bring up when someone suggests using LLMs, despite it being (more or less) a solved problem.

No, it's not. It's _the_ problem that's yet to be solved. And yet every AI company prefers chasing benchmarks, agents, or whatever the trend du jour is.

> I work mostly in Go. [...] LLMs kick ass generating it.

I also work mostly in Go. LLMs do an awful job generating it, just as with any other language. I've had the same shitty experience generating Go, as I've had generating JavaScript or HTML. I've heard this excuse that the language matters, and IME it's just not the case.

Sure, if you're working with an obscure and niche language for which there is less training data, I suppose that could be the case. But you're telling me that there is no good training data for Rust, the trendiest systems language of the past ~decade? C'mon. Comparing Rust to Brainfuck is comical.

I won't bother responding to all points in this article. I will say this: just as AI doomsayers and detractors deserve criticism, so does this over-the-top praising. Yes, LLMs are a great technology. But it is also part of a wildly overhyped market that will inevitably crash as we approach the trough of disillusionment. Their real value is somewhere in the middle.

James_K

5 days ago

Use of LLMs to program really doesn't make much sense. The very nature of computer software is that it's a small effort from a small number of people that is multiplied out to massive productivity gains for a large number of people. Unless the LLM is producing better output than the people would, you are saving time for a small number of people in exchange for degrading the experience of a much larger number of people. If your AI-coded app goes down for a day and you have to vibe-debug it, that is probably gonna put you in the red from a time perspective.

With the amount of supervision you need to give this stuff, I can't see it being useful for much more than auto-completing the next few lines of code as you type. You've got to read and go over the whole output carefully anyway, so you might as well have finer grained control over it.

palata

3 days ago

What an arrogant jerk.

It sounds like someone taking a condescending tone to say that "Fast fashion is great! The clothes are really fine, and being able to wear a t-shirt once and then throwing it away instead of washing it is more convenient. This is most definitely progress. Sustainability? Hmm I can't say, that's not my field of expertise, therefore it's probably not important".

agentultra

5 days ago

I just don’t like using AI tools. It’s absolutely boring. If you like that, fine.

Me? I like programming.

voxl

6 days ago

I have a very simple counter argument: I've tried it and it's not useful. Maybe it is useful for you. Maybe even the things you're using it for are not trivial or better served by a different tool. That's fine, I don't mind you using a tool far away from my codebase and dependency tree. It has not been useful for me, and it's very unlikely it's ever going to be.

Except that's not the argument people are making. They are arguing it will replace humans. They are arguing it will do research level mathematics. They are arguing this is the start of AGI. So if you want to put your head in the sand and ignore the greater message that is plastered everywhere then perhaps some self reflection is warranted.

simonw

6 days ago

You have to learn to filter out the people who say "it's going to replace human experts" and listen to the people who say "I'm a human expert and this stuff is useful to me in these ways".

IshKebab

6 days ago

> I have a very simple counter argument: I've tried it and it's not useful. Maybe it is useful for you.

Indeed but the tedious naysaying that this is arguing against is that AI isn't good full stop. They aren't saying "I tried it and it's not for me but I can see why other people would like it".

globnomulous

5 days ago

I've only just realized how bad AI is for me, in a manner I didn't at all expect. I don't use it. I don't want to use it. I hate it. As another commenter here nicely put it, I want to do, not tell. I like writing code. I like solving problems. I like doing the work myself. I like struggling and then, the next day, finding that my capabilities have grown.

And yet I'm overwhelmingly demoralized and depressed about AI's encroachment on software engineering, so much so that I am finding it difficult to marshal cognitive resources. I'm finding it harder and harder to engage with coding problems, even though I love programming.

The last thing I expected was that AI, even kept at a distance, would rob me of the joy I find in one of the things I love most.

It so devalues what I do and my basic capabilities as a human being that I kind of want to die. I'll stick around for my daughter, my partner, and any future children, but I'm not optimistic at all about the future.

xantronix

5 days ago

It's enough that I consider peacing out. Not out of life, mind, but out to the woods on a permanent basis. I want zero proximity to the cynicism that runs rampant through the industry because it's so much more difficult to discern useful hype from slop than I could have ever imagined. The bastards are winning.

GenshoTikamura

5 days ago

Stay strong. Someday we'll all meet at the opposite riverside, as in Fahrenheit 451.

sottol

6 days ago

I think another thing that comes out of not knowing the codebase is that you're mostly relegated to being a glorified tester.

Right now (for me) it's very frequent, depending on the type of project, but in the future it could be less frequent - but at some you've gotta test what you're rolling out. I guess you can use another AI to do that but I don't know...

Anyway, my current workflow is:

1. write detailed specs/prompt,

2. let agent loose,

3. pull down and test... usually something goes wrong.

3.1 converse with and ask agent to fix,

3.2 let agent loose again,

3.3 test again... if something goes wrong again:

3.3.1 ...

Sometimes the Agent gets lost in the fixes but now have a better idea what can go wrong and you can start over with a better initial prompt.

I haven't had a lot of success with pre-discussing (planning, PRDing) implementations, as in it worked, but not much better than directly trying to prompt what I want and takes a lot longer. But I'm not usually doing "normal" stuff as this is purely fun/exploratory side-project stuff and my asks are usually complicated but not complex if that makes sense.

I guess development is always a lot of testing, but this feels different. I click around but don't gain a lot of insight. It feels more shallow. I can write a new prompt and explain what's different but I haven't furthered my understanding much.

Also, not knowing the codebase, you might need a couple attempts at phrasing your ask just the right way. I probably had to ask my agent 5+ times, trying to explain in different ways how translate phone IMU yaw/pitch/roll into translations of the screen projection. Sometimes it's surprisingly hard to explain what you want to happen when you don't know the how it's implemented.

zmmmmm

5 days ago

I think in a lot of cases the skeptics haven't really witnessed how good LLMs can be for a subset of software engineering tasks. I find for mid-complexity code they are far, far better than I am at coding. Even though I often tweak their solutions, they outweigh that by often identifying APIs and even whole libraries to call where I fully would have set about implementing it myself - either out of ignorance or an assumption that it would be too hard to implement it myself. They also put in much more error checking than I would normally do. Even if I do implement something myself, it's insane not to at least see what the LLM would do to get ideas. The thing is, once I do that, I nearly always come away with the conclusion that I might as well just use the code it made.

ivape

5 days ago

The primary skill a developer hones is the ability to know when something is already solved for you. This is where innate laziness is a gift. Why should I write a fetch function if api.fetch exists? This is a natural game time automatic decision, like a baseball player running for first base soon as the ball hits the bat. The natural thing that should be happening to every developer is the natural realization that the LLM is the ultimate api. I have literally sat here going “well why build anything, this damn thing does this and that for free”. For example, why build a todo app? Any of the LLMs can mostly do this. This is not self awareness, but game conditioning and fitness. You should automatically be feeling the sensation just like someone playing a sport for the millionth time.

Developers thinking AI is a fad is like a left fielder thinking about if they should stick their hand out to catch the ball or not. You can’t think about this.

Anyway, I find myself in stasis trying to actually see what to build given the reality of the power of this api. The South Park episode of “The Simpsons did it” is an apt episode. We are very much at the point where the continuous question will be “Well the LLM can do it”.

kiitos

4 days ago

I read this article completely thru, twice, and I still have no idea what the author is trying to say. All I can read here is a bunch of sarcastic dismissals of stuff that I don't believe and have no way to evaluate.

greybox

6 days ago

> Does an intern cost $20/month? Because that’s what Cursor.ai costs.

So then where do the junior developers come from? And then where do the senior developers come from?

Ygg2

5 days ago

As we all know by that time AI will make all senior developers immortal, rendering junior developers obsolete.

dominicrose

5 days ago

> but it’s bad at rust

I gave chatgpt (the free desktop web version) my Ruby solution to one of my favorite adventofcode challenges and asked it to write in in Rust. I've no experience in Rust except reading the few first chapters of the book but with the help of the compiler I managed to fix it and it gave the right responses 100x faster than Ruby did. So chatgpt did a great job keeping the logic intact.

I would have done the same thing if LLMs didn't exist, write it in Ruby first and then in Rust. The flexibility and expressiveness of Ruby allows to actually find a working solution. Experienced Rustaceans would do it directly in Rust, but writing in Ruby should be easier and an LLM can transform it.

cerpins

5 days ago

What worries me about this is that the benefit of well practiced developers using AI to boost their productivity is there, but what happens to the new developers that piggyback off of LLM tooling too much and end up with worse understanding as a whole?

cdrini

5 days ago

I think this has always been a problem; eg the students who do all their assignments in groups and coast through with a cursory understanding of their degree. Or those who cheat.

If someone doesn't want to learn, there have always been ways to avoid it. Now it's easier with LLMs, that's true. But for people who do want to learn, I think it's also easier now with LLMs.

I'm not sure where the equilibrium will land, in terms of whether we'll end up with more people with a deep understanding, or more people with a shallow understanding. Or maybe the ratio will end up being the same, but there will just be more people in both groups! Ie the ease of learning will pull some of the previously shallow learners into developing a deep understanding, and the ease of coasting will pull up people who were never into development before into being shallow learners.

grumple

5 days ago

Where are the results? I keep hearing about how great these coding agents are. What have you built beyond the complexity of a disposable toy app? Where are the swarms of new apps built by these agents that are solving new problems and taking over markets? I’m just not seeing results. I guess I need to spend some time using cursor as an agent to see it for myself.

I am a believer that these tools will be (if they aren’t already) extremely useful. Game changing in many ways, and I worry about the future of knowledge fields. Definitely not a skeptic in the trajectory. I do wonder if that makes those of us who are on the cutting edge more valuable as AI can quickly reproduce things that are well understood.

mdavid626

5 days ago

Please stop this. It’s clear to me that you’re not a software engineer. You weren’t one before AI, and you are not one after. You can do whatever you want, but please don’t push this AI narrative down everyones throat. I’ve had enough of this.

lowsong

5 days ago

Does no-one think ahead, for more than a moment?

> Professional software developers are in the business of solving practical problems for people with code.

Are we? Is that our job? You might think it is, and it might even feel like it. But it is not. Our job, in 99% of cases, is to make stock price go up. That means making software that people will buy, making people want to keep buying it, and doing whatever makes your investors happy.

So what happens as you use these tools? Your job changes from coder to "LLM manager", but let's assume you're happy with that. You're more "productive", which is an artificial metric and nothing to do with your real job as described above. That's fine for now, but what happens down the road? Now your company is paying more and more money to these LLM services, and paying you less and less.

> "but they take-rr jerbs" So does open source.

This is a gross misunderstanding of open source. Most open source developers are paid to be, full time, by companies. Why? Warm fuzzy feeling? Goodness of their heart? No. It's because it's good for business. These open source techs are a baseline, there' not worth being competitive on. Build the open base in the direction you want, and build the "business value" ontop for profit.

> We used to pay good money for databases.

We still do. Have you looked at managed PostgreSQL prices lately? Also if you haven't noticed, Oracle hasn't gone anywhere.

> LLMs really might displace many software developers. That’s not a high horse we get to ride. Our jobs are just as much in tech’s line of fire as everybody else’s have been for the last 3 decades.

This isn't the same thing as replacing punch cards, or moving away from Java 6, or replacing artisans with factory machines. This is centralisation and rent-seeking. As companies become more and more dependant on a handful of companies with the upfront capital to build massive server farms... what do you think is going to happen? Once they've convinced enough people that LLMs are essential to their work it's going to cost a lot more than $20/month. Once these companies are entrenched they will wring every last dollar out of every company that's grown to depend on them, many will simply fold.

Maybe we are in a seismic shift in our industry. But the beneficiaries are not going to be whole-hearted LLM adopters, no amount of saying "you need to change how you work" is going to save any of us. The only people that will benefit are going to be the people that own the data centres.

computerfan494

6 days ago

Maybe it's only me, but I just don't write that much code. I try to change less than 100ish lines per day. I try to keep codebases small. I don't want to run a codebase with hundreds of thousands of lines of code in a production environment.

m0llusk

5 days ago

> If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing.

Translation: Drop your work and current tools every few months to pick up the shiny new thing or you aren't cool or even sane.

Reality: Your favorite cool new tool will be integrated into development work as it is proven, documented, and time allows. This idea that development must leap onto the latest shiny new thing every few months is extremely corrosive to reasonable business practice. The needs of the many outweigh the preferences of the lead developer.

techpression

5 days ago

This post feels quite dishonest. Every concern is countered with some variation of ”you’re wrong and I am right”, zero nuance.

I use Claude for specific questions, it works great (but apparently I’m doing it wrong according to the author), code generation doesn’t, I don’t want my boilerplate code to have subtle differences in every file, that’s the worst kind of boilerplate and that’s what I get if I automate it.

There are real world implications of all the infrastructure being built for this too, not to mention the extremely short e-waste cycles during the boom we’re in right now. To me it also seems like LLMs are the wrong thing to solve for, it’s like a gasoline car, make the engine larger and more fuel consuming to go faster and no matter how much you perfect it, it will still be fossil fuel being burnt at a high rate. What’s the alternative, I don’t think we know yet, but if we were a smarter species we would see that this is not it.

kkylin

5 days ago

I found this comment especially interesting: "The code in an agent that actually “does stuff” with code is not, itself, AI. This should reassure you. It’s surprisingly simple systems code, wired to ground truth about programming in the same way a Makefile is. You could write an effective coding agent in a weekend. Its strengths would have more to do with how you think about and structure builds and linting and test harnesses than with how advanced o3 or Sonnet have become."

I haven't used coding agents. Can someone who has (or knows about these things) elaborate, or point me to a good resource? Thanks!

danielbln

5 days ago

It's a while loop. You collect a ton of context (file paths, function signatures, etc.), provide a bunch of functions to the LLM that it can ask to be called and an objective ("Implement X", "figure out X"). Then you send this to the LLM, perform the function calls it requests ("open file", "edit file", "call service X") and append their output to the payload and send it back to the LLM. Rinse and repeat until objective is complete (in which case the LLM may request to call a work_complete function/tool).

That's really it. Stuff the context, loop over the LLM until work is done.

kkylin

5 days ago

Thank you!

rorylaitila

6 days ago

I can sum it up like this: if I could know in advance the exact right thing to build, producing the physical code, has not for a long time, been the bottleneck. I've been vibe coding long before it was cool. It's sometimes called model driven development.

For those that think only procedurally, I can see how it helps them. Because procedural first development has a lot of boilerplate logic.

For those who think model first, the AI may help them rubber duck, but ultimately the physical writing of the characters is minimal.

Most of my time is thinking about the data model. The AI writes almost all of my procedures against said data model. But that is about 20% speedup.

timkofu

4 days ago

I’m gaining more confidence AI is unable to take any of our jobs if this level of marketing is needed to try to convince us it will.

fhkatari

5 days ago

This: "Important caveat: I’m discussing only the implications of LLMs for software development. For art, music, and writing? I got nothing. I’m inclined to believe the skeptics in those fields. I just don’t believe them about mine". I want to emphasize this. AI seems like a great 'first draft', but not a 'finished product'. In the hands of a programmer, a first draft is a great start. But pushing AI to frontline workers or customer support fails, as Klarna learned, because a first draft is not good enough.

fjfaase

5 days ago

When I ask chatGTP to generate the code for a M4F MCU that implements the VirtIO code for accessing the GPIO through RPMSG using FreeRTOS it produces two answers that are both incomplete and incorrect.

cdrini

5 days ago

This is very outside my domain :P I asked ChatGPT to explain the acronyms in your comment and still do not understand it. But I think a reason LLMs might struggle is that there are too many up front complex requirements.

Giving it only 1 or 2 complex requirements at a time, and then having it iterate, I've found to be more effective.

Most LLMs don't "think", so when asking an LLM something, I generally try to think "would I be able to do this without thinking, if I had all the knowledge, but just had to start typing and go?".

You could break down your prompt into separate prompts like this maybe: https://chatgpt.com/share/683eb7d7-e7ec-8012-8b3b-e34d523dc9...

I think it broke things down in a weird way, but I definitely can't analyse the correctness of anything it outputs in this domain :P

Coding specific agents like Copilot might be better able to handle a complex initial prompt, since they take the initial prompt, and use LLMs to break it down into smaller steps, which ChatGPT doesn't do. They can sort of "think". Deep research AIs also have a sort of thinking too, so might do better.

fjfaase

5 days ago

I did follow your suggestion to break it down in separate prompts, but that to did not result in any useful code. I did not try to compile it, but I get the impression that it does not even compile. There is nothing remotely related to the VirtIO protocol in the code. See Section 5.18 of https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-...

When I added the prompt: 'Could you use the specification given in Section 5.18 of https://docs.oasis-open.org/virtio/virtio/v1.3/csd01/virtio-...' it produced almost the same code preceded with some babbling from the document, but not using anything from the specification, not even the code fragments mentioned in the section.

cdrini

5 days ago

Ah darn. Yeah that is a tricky space, I'm curious to find if there's an incantation to make it actually do the right thing... But knowing next to nothing of your domain makes that impossible for me :P I will note ChatGPT isn't great at very long documents as input; Gemini Pro performs better there. Pulling out the specific section _might_ help, but that's getting into a-lot-of-work territory.

Alas, good luck!

fjfaase

5 days ago

I was just giving it as an example of how there are domains where AI assistance have little or no benefit. Even if you used them for searching for resources or instructions how to achieve something, they come up with nothing useful, or lead you in the direction you do not want to go. It is just like searching with Google: If you search for the thing that 90% of the people are searching for, your answer will be in the first page. If you search for something outside that, you have to skip many pages, or even not being able to find it at all.

One could also conclude that a large portion of software engineering is rather is mostly implementation of things that have been implemented many times before and that only a small portion consist of real software engineering where you have to develop code for a problem that nobody ever wrote before or that require a deep understanding of the problem domain.

cdrini

5 days ago

> One could also conclude that a large portion of software engineering is rather is mostly implementation of things that have been implemented many times before

Strong +1. That's one of the reasons I enjoy using LLMs, cause I'm often using it for the pieces of code which I don't even find rewarding to write; it's just glue code which I know has been written 1000 times over, which _I've_ already written 1000 times over across different projects.

ivanjermakov

5 days ago

I wonder if "lack of data" problem is solvable within LLM. Humans can apply general programming language knowledge when using a new language, but LLMs have to have lots of concrete source code to work well.

Today these models fall apart very quickly once you step out of js/python/c land to rust/zig/ocaml.

With today LLM tech, feeding it language documentation does not work great.

lawn

5 days ago

Not a very bright move from fly to publish this kind of condescending post on the company blog.

Making fun of people who care about copyright (implying that fly abuses GPL for instance) is just icing on the cake.

gigel82

6 days ago

What this boils down to is an argument for slop. Yeah, who cares about the quality, the mediocrity, the craft... get the slop, push it in, call it done. It mostly works in the golden path, it's about 6 or 7 orders of magnitude slower than hand-written software but that's ok, just buy more AWS resources, bill the client, whatever.

I can maybe even see that point in some niches, like outsourcing or contracting where you really can't be bothered to care about what you leave behind after the contract is done but holy shit, this is how we end up with slow and buggy crap that no one can maintain.

pixl97

6 days ago

>Yeah, who cares about the quality, the mediocrity, the craft..

Just about no-one in the F100 unless they are on very special teams.

If you care about the craft you're pushed out for some that drops out 10x LOC a day because your management has no ability to measure what good software is. Extra bonus points for including 4GB of node_modules in your application.

trinix912

6 days ago

It's not much different without the AI. Managers don't care about efficient code, they care about code that meets the business goals - whether that's good or bad is debatable. Agencies duct-taping together throwaway code isn't new. The classic "just buy more AWS resources" & such have been around for quite a while.

palata

6 days ago

Without AI, software is getting worse every year. My question is: will AI resolves that problem, or will it make it worse?

My feeling is that AI will keep the software industry in the same trend: produce more software of lower quality while being more profitable.

People sometimes say "perfect is the enemy of good". But this implies that putting less effort is "good enough". I believe it goes the other way: why would a company put efforts into selling you something good, if it can make you pay the same for something worse? It's all about maximising profit, and AI is no exception there.

atakan_gurkan

5 days ago

"If you’re making requests on a ChatGPT page and then pasting the resulting (broken) code into your editor, you’re not doing what the AI boosters are doing. No wonder you’re talking past each other."

This is in fact what I am doing (though I am not trying to convince anyone not to use LLMs). How can I get into this other mode? Any pointers? Do I need to pay up for a service? I do not want to carry out vibe coding, I want to use LLMs to solve physics, math and logic problems/puzzles. I'd appreciate any recommendations.

baduiux

5 days ago

Just have a look at Cursor.ai or zed.dev. Cursor is based on VS Code and integrates different models right into your codebase. It has access to your codebase, knows your code (after indexing) and is able to directly apply changes to your code or run commands on your integrated terminal. You can directly accept/decline changes. You can reference existing code / files. You can link documentation, you can write a default prompt and default settings (previously .cursorrules).

It's faster than opening chatgpt in browser, writing your promt, waiting for the reply, scanning and scrolling through the reply, copying it, inserting it manually, going back to say something doesn't work.

wat10000

6 days ago

I wonder if it comes down to there being many kinds of programming and this stuff is good for some but not all.

The author describes a world where most code is boring grunt work and most collaborators are mediocre. This world is not familiar to me. I can see LLMs being useful in such a world, but it’s not where I work.

The comment about buying a table instead of building it really stuck with me. We’re the ones who design the table! Saying we should buy one from the store is like saying farmers should buy their food from the store instead of putting in all that work to grow it.

Ensorceled

5 days ago

I've been using LLM's pretty regularly for just over a year (my PyCharm AI license just renewed). My experience before that was pretty mixed and fell into the "sort of worth it" category.

It's improved to the point where the summary that I would email a junior developer is a pretty good prompt that produces code pretty close to want I want.

I had a weird project recently where I needed a stand alone event handler that would send an email receipt to the customer, another email to customer service, log the event in an accounting system via an API. I wrote up a proposal for the rest of the management team and, once accepted, pasted the proposal into ChatGPT (and a few others, but ChatGPT gave the best results).

I basically got a full recipe for setting up AWS API Gateway, Lambda, and SES, the Python Lambda function with api calls to SES, suggestions on using AWS SAM, etc. etc. Tweaked the prompt a couple of times and then took over from there. I had this into the hands of the testing team in less than a day.

I'm pretty sure this would reduce my team size in ANY role I've had in my career. Maybe not halve the teams, but it would eliminate a lot of the junior or journeymen developers.

kerryritter

6 days ago

A well articulated blog, imo. Touches on all the points I see argued about on LinkedIn all the time.

I think leveling things out at the beginning is important. For instance, I recently talked to a senior engineer who said "using AI to write programming is so useless", but then said they'd never heard of Cursor. Which is fine - but I so often see strong vocal stances against using AI tools but then referring to early Copilot days or just ChatGPT as their experience, and the game has changed so much since then.

Cloudef

4 days ago

LLM fanatics throw me nuts, instead of learned helplessness, at least try to understand what the token predictor outputs and be honest when you used LLM to generate your code dump when asking for help. Otherwise you are just wasting time and being simply a middle man for a machine and person who expects an interaction with actual human being with their valuable time.

martythemaniak

6 days ago

At 0 temperature an LLM is a Map<String,String> - a string input (key) will give you the same string output (value) every time. Hypothesis: there exists a key whose value is a complete, working, fully-tested application which meets your requirements 100% and fulfills your business need. This key is the smallest, most complete description of what your application does. It is written in natural language and represents a significantly compressed version of your application code.

My part-time obsession over the last few months has been trying to demonstrate this and come up with a method for finding these magic keys (I even tried to get the LLMs to search for me, lol). What I really want is to give the latest thinking models (200k input, 100k output) a 5-6 page design doc (4k words, 5k tokens) and have them produce a complete 5kloc (50k tokens) microservice, which would show a 10x compression. It's hard, but I haven't seen any reason to think it wouldn't work.

For better or worse, I think this will be close to what IC jobs will be like in few years. Fundamentally, our jobs are to try work with other functions to agree to some system that needs to exist, then we talk to the computers to actually implement this. If we switch kotlin+compiler for design doc+llm, it still going to be somewhat the same, but far more productive. Agents and such are somewhat of a stop-gap measure, you don't want people giving tasks to machines, you want to accurately describe some idea and then let the computers make it work. You can change your description and they can also figure out their own tasks to evolve the implementation.

01HNNWZ0MV43FF

6 days ago

Damn. Well I'll spend a few bucks trying it out and I'll ask my employer if they're okay with me using agents on company time, but

But I'm not thrilled about centralized, paid tools. I came into software during a huge FOSS boom. Like a huge do it yourself, host it yourself, Publish Own Site, Syndicate Elsewhere, all the power to all the people, borderline anarchist communist boom.

I don't want it to be like other industries where you have to buy a dog shit EMR and buy a dog shit CAD license and buy a dog shit tax prep license.

Maybe I lived through the whale fall and Moloch is catching us. I just don't like it. I rage against dying lights as a hobby.

jay_kyburz

6 days ago

Yeah, I'm ready to jump in, but I need an agent running on my hardware at home without internet access.

How far away are we from that? How many RYX 50s do I need?

This is a serious question btw.

pixl97

6 days ago

It's unfortunate that AMD isn't in on the AI stuff, because they are releasing a 96GB card ($10k so it's pricey currently) which would drop the number you need.

guywithahat

6 days ago

I mean it depends on the model; some people running deepseek report they have better performance at home running on a CPU with lots of ram (think a few hundred gigabytes). Even when running locally vram is more relevant than the performance of the GPU. That said I'm really not the person to ask about this, as I don't have AI agents running amuck on my machine yet

renjimen

6 days ago

You can self host an open-weights LLM. Some of the AI-powered IDEs are open source. It does take a little more work than just using VSCode + Copilot, but that's always been the case for FOSS.

Philpax

6 days ago

An important note is that the models you can host at home (e.g. without buying ten(s of) thousand dollar rigs) won't be as effective as the proprietary models. A realistic size limit is around 32 billion parameters with quantisation, which will fit on a 24GB GPU or a sufficiently large MBP. These models are roughly on par with the original GPT-4 - that is, they will generate snippets, but they won't pull off the magic that Claude in an agentic IDE can do. (There's the recent Devstral model, but that requires a specific harness, so I haven't tested it.)

DeepSeek-R1 is on par with frontier proprietary models, but requires a 8xH100 node to run efficiently. You can use extreme quantisation and CPU offloading to run it on an enthusiast build, but it will be closer to seconds-per-token territory.

1970-01-01

6 days ago

Everything works right until it doesn't. LLMs are trained on things that have worked. Let's revisit in 2027 when things are insanely faster, but not much better.

zombot

5 days ago

> Does an intern cost $20/month? Because that’s what Cursor.ai costs.

And who the fuck will train those interns, now they've been replaced by machines?

moron4hire

5 days ago

I think I figured out a big problem I have with AI, and that is that I just hate using other people's software. The vast majority of software is just garbage. It all tries to take way too much of my attention. It doesn't help that all these AI tools are new, so they are inextricably linked to some SV startup that wants to be the next unicorn, so they have some bullshit account-generation/data-retention/terms-of-service/acceptable-use-policy/remarketing-email-campaing song and dance that is all stuff I have to pay attention to. All I really want to do is sit down and think on my own. I'm constantly inundated with distractions at work, so when I get a few minutes to myself, I sure as hell don't want an AI bot messaging me about reviewing its PR. I don't even want to review the human-written PRs; just push your code already, if it breaks you get to fix it. I certainly don't want a 500th unread email in my inbox.

You know what doesn't constantly nag for my attention? Dead text files on disk. They just sit there, waiting for me to grace them with my presence. That's the way it should be. Computers serve me, not the other way around.

dev_l1x_be

5 days ago

So the reaso we should not use Rust (a memory safe language that gets rid of categories of bugs) is because some new AI tools are not read?

Using AI for over a year now on a daily basis, most AI tools do not struggle with Rust, they struggle to provide value. If I do not instruct them to keep the output to the bare minimum, I usually end up with 10x the output what would be a simple solution to my problem.

quasimodo_code

5 days ago

I think for most people that do not get value out of AI assisted coding, its a skill or learning issue. I don't care how good of a developer you consider yourself to be. If in 2025, these tools do not provide you significant (to almost absurd) value, you just do not know how to use them, or are to stubborn to push past the initial learning curve. Alternatively, maybe... programming is changing so much with AI, that you just might not see what's happening right infront of your eyes because its so foreign to all the other paradigms in the past.

In 2025 I am about 800-900 hours deep into using Cursor, blowing through 500 credits each 2 days (500 credits = 20$). I also organize a local "vibe code" workshop and meetup, with people from all sorts of experience levels, from no coding experience, to twice my senior. And what I am seeing is that what separates those from those who are good at using AI and those who are not, generally is a level of rigid stubbornness.

I can't wait for time to do its course here and those people who now are "all nuts" to slowly see programming of the past become an old relic.

kortilla

5 days ago

The premise of “lol, I’m just a dumb developer and just wait until the smart developers catch on” falls extremely flat coming from Thomas.

It kinda invalidates the entire premise because it builds on this false foundation of “look how useful it is to a mere plebeian” and the skills actually required to sift through LLM garbage are what puts you in the top 5% of programmers.

Most senior developers cannot conduct useful code review and spot errors. In every company I’ve been at, there were only 1 in 20 people that would give the useful in-depth reviews required to pick up LLM bullshit.

This style of engagement bait is brutal and it severely overshadows other nuggets of wisdom of how LLMs can be useful.

baxtr

5 days ago

There’s a fine line between being an AGI skeptic and an AI skeptic.

While I don’t believe the “AI will take all jobs” VC tales, I also think not using LLMs in software development workflows is similar to not wanting to use Excel as an accountant in the 1990s.

AI won’t replace all jobs, but companies that cleverly integrate it into their processes will be faster and more efficient compared to those that don’t.

fnordpiglet

6 days ago

FWIW with proper MDC/ rules I’ve found LLM programming agents excellent at rust. There’s a lot of complex and tedious minutiae in rust that I know but forget to apply everywhere it’s helpful while a SOTA LLM agent does well, especially with proper rule guidance to remember to use it.

Generally though I find LLMs have a pretty rapidly diminishing return on what you can expect out of them. They’re like a 3-5 year senior programmer that has really learned their domain well, but doesn’t have the judgement of a principal engineer. You get to a point where you need to reach in and right things and really pay attention, and at that point the diminishing returns set it rapidly and you’re better off just doing the rest yourself. Refactors and stuff can be delegated but that’s about it.

I find this true regardless of the language. None the less, I’ve been able to improve my overall velocity dramatically completing several projects in the last few months in the span of one typically. If tooling improves I hope to continue that but I’m already getting close to the limit of how fast I can conceive of useful creative things.

shaky-carrousel

5 days ago

It is going to be so much fun when all AI companies suddenly double prices, because what are you going to do, hire juniors?

kotaKat

5 days ago

Nah, I’m not nuts, you’re just terminally brain damaged by the amount of ketamine you’ve huffed thinking AI was such a great tool.

dspillett

5 days ago

> but the craft - Do you like fine Japanese woodworking? All hand tools and sashimono joinery? Me too. Do it on your own time.

That is not my objection, though some seem to think it is.

I got into coding because I found it interesting and fun. I've avoided managing others because I hate that sort of thing. Working with an LLM, prompting it to do work then assessing that work and adjusting the prompts or cleaning/completing the output myself, is far too much like outsourcing work to a 3rd party junior, and I don't want to do that whether said junior is a human in a far off land (or just down the street, for all I know!) or some LLM run by MegaCorp-ᴙ-Us.

I was already seriously disliking much of the modern programming industry (remote teams, the shit-show that client-side web work continues to devolve into, and much else) anyway. If it gets to the point where I can't do my job (or can't do it competitively enough) without using LLMs on top of dealing with the other things I don't care for, I'll be retraining for something else entirely. Heck, I've got the mortgage paid, I don't have or plan to have a family beyond the cats, I could just about survive on shelf-stacking…

I know it isn't a passing fad, though I'm convinced the current and near-future iterations aren't going to be quite as significant in the longer term as many believe. I don't object to others using it (aside from the matter of “MegaCorp-ᴙ-Us”s of the world suddenly deciding that piracy is just fine because they stand to benefit from it, in fact are already benefitting from it, in this case). But I don't plan to be nagged into using it myself.

> Do it on your own time.

If/when that career shift happens, or I survive as I am long enough to retire, I will probably do exactly that. As it stands I spend more than enough of my life on tech in DayJob, my own time is increasingly filled with unrelated hobbies and people.

keeda

5 days ago

A lot of the doubters are basically saying: "If it's so great, show me proof." While that is usually the right stance, I think it is short-sighted in times of revolutionary change.

To be sure, you definitely should not blindly trust the people who have a stake in selling AI products. However, you probably should trust the people who have no stake in selling AI, but are using it to accelerate themselves. The problem is, these people are probably too busy building things to spare any time convincing you. And in fact, it is a competitive advantage for them if others don't figure it out.

Here's my take: Everyone needs to figure out how AI works for them. The thing with AI is that it is a technology that behaves like people. It has the speed, efficiency and scale of computers but the fallibility and quirks of humans. This is why it's so confusing to discuss, and why folks have such varying experiences. The right approach to working with AI is like working with people: understand what they're good at and where they are weak, and then work with them to achieve your goals.

This will require using it "in anger" on many non-trivial tasks over a significant period of time, which will take persistence and patience. Yes, the hype is so over-the-top that you might not want to invest your valuable time on it. But you owe it to yourself to ensure you can fully capitalize on the sea change that is happening.

If it helps, I (like the OP) have no stake in selling AI and I posted this comment about my experiences on the other thread about the AI coding -- https://news.ycombinator.com/item?id=44163543 -- The upshot is, AI has enabled me to produce innovative (probably cutting edge) work in domains I had little prior background in. And I've not even used agents yet!

Nevermark

5 days ago

> Everyone needs to figure out how AI works for them.

That is the real takeaway.

I use LLM’s for dozens of things. It took time to find how model strengths best connect with mine.

People who only evaluate models for one purpose over short timespans are going to have a hard time.

It’s like models come from a different cognitive & communication culture. Eliminating that barrier takes more experimentation and experience than logic.

I think their creativity is also often a function of the prompters creativity.

The harder you (creatively) push them into a creative corner, the more genuinely creative they get.

Not unlike how a good teacher can get unexpected positive results from students when they are channeled and pushed.

jongjong

6 days ago

I believe that AI is very useful in software development but I don't buy the narrative that AI is responsible for layoffs over the past few years (at least not most of them). I think that narrative is a convenient cover for systemic misallocation which created a need to contain inflation. I think big tech execs understood that, beyond increasing their company stock prices, they also need to work together to keep the monetary system itself under control. This is why they've been firing people whilst having record profits. They've reached such scale and the system has reached such fragility that they have to think and act like economists to keep the thing going. The economy itself has become the responsibility of big tech.

But who knows, maybe AI will accelerate so rapidly that it will fix the economy. Maybe we'll have robots everywhere doing all the work. But I worry about the lack of market incentives for people to adapt AI to real world use cases.

For example, I'm an open source developer who likes to tinker but I've been booted out of the opportunity economy. I can't afford to program robots. People like me are too busy using AI to parse spreadsheets and send targeted ads to even think about automating stuff. We work for companies and have no autonomy in the markets.

If things had worked out differently for me, I'd probably own a farm now and I'd be programming robots to do my harvest and selling the robots or licensing the schematics (or maybe I'd have made them open source, if open source had worked out so well for me). I don't have access to such opportunity unfortunately. The developers who worked for big tech are good at politics but often disconnected from value-creation. Few of them have the skills or interest to do the work that needs to be done now... They will just continue leveraging system flaws to make money, so long as those flaws exist.

strogonoff

5 days ago

If yesterday we were sued for singing the happy birthday song in public, and today we see corporations with market caps the size of countries pretending that IP ownership is not much of a thing, the alarm bells should be deafening. Control over intellectual property has given us fantastic things—Linux, Blender, etc.—and you can’t have GPL without an ability to defend copyleft. IP laws provide that ability.

One does not need to be a skeptic about machine learning and its potential as technology to refuse to engage with its practical applications when they are clearly ethically corrupt. The ends do not justify the means. Just like refusing to buy products of forced labour or disproportionate environmental impact, or to invest in war and bloodshed, everyone makes this decision for themselves.

patcon

5 days ago

I so strongly agree with this.

I've never been a completionist. I'm distracted. I like novelty. I make half-form experiments, but mostly provide deeply immersive conversations for the benefit of a small group of people around me, some of whom go on to make/do things that I proudly feel my influence on. The final 10% of work often defeated me.

Now, I just make it happen. The pain is gone. Something else can carry the last 10%, and my ideas get realized by me (and AI).

I can't even conceive how this will change the world en masse. Knowledge domains (not just tech, but science, art, etc) have been built of the things that the type-A completionists knew how to see. The others didn't finish, and what they saw of the world has largely been marginalized.

The mind of ADHD and creative types is about to get much more represented in the output products of civilisation. This is going to be a big recalibration, and I think a re-balancing toward something more healthy tbh

averageRoyalty

5 days ago

I feel silly asking, but is this reality or hyperbole?:

> An agent can occupy itself for hours putzing with your tests in a VM and come back later with a PR.

> Kids today don’t just use agents; they use asynchronous agents. They wake up, free-associate 13 different things for their LLMs to work on, make coffee, fill out a TPS report, drive to the Mars Cheese Castle, and then check their notifications. They’ve got 13 PRs to review. Three get tossed and re-prompted. Five of them get the same feedback a junior dev gets. And five get merged.

I'm working better and harder with Roocode in Code mode (and somtimes Architect). I've only been doing it for a few weeks, and it reads to me like I'm already using an 'old approach'.

mjburgess

6 days ago

Can we get a video of a workday conducted by these people?

Unless there's a significant sense of what people are working on, and how LLMs are helping -- there's no point engaging -- there's no detail here.

Sure, if your job is to turn out tweaks to a wordpress theme, presumably that's now 10x faster. If its to work on a new in-house electric motor in C for some machine, presumably that's almost entirely unaffected.

No doubt junior web programmers working on a task backlog, specifically designed for being easy for juniors, are loving LLMs.

I use LLMs all the time, but each non-trivial programming project that has to move out of draft-stage needs rewriting. In several cases, to such a degree that the LLM was a net impediment.

jsnell

6 days ago

Not exactly what you're asking for, but https://news.ycombinator.com/item?id=44159166 from today is not a junior web programmer working through the backlog, and the commit history contains all the prompts.

mjburgess

6 days ago

Sure, thanks. I mean it's a typescript OAuth library, so perhaps we might say mid-level web programmer developing a library from scratch with excellent pre-existing references, and with a known good reference API to hand. I'd also count that as a good use case for an LLM.

citizenpaul

5 days ago

I watched someone do a similar demonstration( https://news.ycombinator.com/item?id=44159166)live at an event. They ended up doing something like 3 pull requests to get the original change. Then had to do 4 more to get it to fix and put back things it removed. Not exactly efficient, and it was painful to sit there and be like I could have had it done manually 20x by now while we painfully waited for the AI to do the changes.

I've never been able to get it to work reliably myself either.

The internet just tells me to prompt harder. Lots of "grind-set" mentality energy around AI if you ask me. Very little substance.

mgraczyk

6 days ago

I have done everything from architecture design for a DSP (Qualcomm), to training models that render photos on Pixel phones, to redoing Instagrams comments ranking system. I can't imaging doing anything without LLMs today, they would have made me much more productive at all of those things, whether it be Verilog, C++, python, ML, etc. I use them constantly now.

mjburgess

6 days ago

I use LLMs frequently also. But my point is, with respect to the scepticism from some engineers -- that we need to know what people are working on.

You list what look like quite greenfield projects, very self-contained, and very data science oriented. These are quite significantly uncharacteristic of software engineering in the large. They have nothing to do with interacting systems each with 100,000s lines of code.

Software engineers working on large systems (eg., many micro-services, data integration layers, etc.) are working on very different problems. Debugging a microservice system isn't something an LLM can do -- it has no ability, e.g., to trace a request through various apis from, eg., a front-end into a backend layer, into some db, to be transfered to some other db etc.

This was all common enough stuff for software engineers 20 years ago, and was part of some of my first jobs.

A very large amount of this pollyanna-LLM view, which isnt by jnr software engineers, is by data scientists who are extremely unfamiliar with software engineering.

mgraczyk

6 days ago

Hmm how did you get that from what I listed?

Every codebase I listed was over 10 years old and had millions of lines of code. Instagram is probably the world's largest and most used python codebase, and the camera software I worked on was 13 years old and had millions of lines of c++ and Java. I haven't worked on many self contained things in my career.

LLMs can help with these things if you know how to use them.

magicalist

6 days ago

Tbf, you're also the CTO of a startup selling AI tools and saying in a nonspecific way that you're sure LLMs would have been helpful on large code bases you worked on years ago. Maybe so, but not at all what they were asking for in the root comment

mjburgess

6 days ago

OK, great. All I'm saying is until we really have videos (or equivalent empirical analysis) of these use cases, it's hard to assess these claims.

Jobs comprise different tasks, some more amenable to LLMs than others. My view is that where scepticism exists amongst professional senior engineers, its probably well-founded and grounded in the kinds of tasks that they are engaged with.

I'd imagine everyone in the debate is using LLMs to some degree; and that it's mostly about what productivity factor we imagine exists.

johnb231

5 days ago

Developers who are productive with these tools are not going to waste time on that.

They are busy doing their work and prefer their competitors (other developers) to not use these tools.

sarchertech

5 days ago

The article you’re replying to is just one of many examples of people who profess to be productive with these tools, but are spending significant time and energy attempting to convince skeptics to use them.

johnb231

5 days ago

For this article he took the time explain how it makes him more productive. What more do you expect? Why are you entitled to any more than that?

sarchertech

5 days ago

I’m not entitled to anything but what you said in the previous post is clearly not the least bit true.

johnb231

5 days ago

You are missing the point. The person I replied to is complaining about the lack of empirical analysis. There is no empirical analysis in the article. Subjective blog articles are not scientific studies.

sarchertech

5 days ago

Someone makes a blog post with the sole purpose of convincing AI skeptics to use AI and wants so badly to convince people to use AI that he even resorts to calling AI skeptics insane.

Someone else responds that video of the author actually using the tools would be more convincing.

Then you respond with essentially “no one wants to convince you and they’re too busy to try”.

Now if you misspoke and you’d like to change what you said originally to “many AI users do want to convince AI skeptics to use AI, but they only have enough time to write blog posts not publish any more convincing evidence”, then sure that could be the case.

But that ain’t what you said. And there’s no way to interpret what you said that way.

CraigJPerry

6 days ago

> it has no ability, e.g., to trace a request through various apis

That's more a function of your tooling more than of your LLM. If you provide your LLM with tool use facilities to do that querying, i don't see the reason why it can't go off and perform that investigation - but i haven't tried it yet, off the back of this comment though, it's now high on my todo list. I'm curious.

TFA covers a similar case:

>> But I’ve been first responder on an incident and fed 4o — not o4-mini, 4o — log transcripts, and watched it in seconds spot LVM metadata corruption issues on a host we’ve been complaining about for months. Am I better than an LLM agent at interrogating OpenSearch logs and Honeycomb traces? No. No, I am not.

mjburgess

6 days ago

Great, let's see it. If it works, it works.

For the first 10 years of my career I was a contractor walking into national and multinational orgs with large existing codebases, working within pre-existing systems not merely "codebases". Both hardware systems (e.g., new 4g networking devices just as they were released) and distributed software systems.

I can think of many daily tasks I had across these roles that would not be very significantly speed-up by an LLM. I can also see that there's a few that would be. I also shudder to think what time would be wasted by me trying to learn 4g networking from LLM summarisation of new docs; and spending as much time working from improperly summarised code (etc.).

I don't think snr software engineers are so scepticial here that they're saying LLMs are not, locally, helpful to their jobs. The issue is how local this help seems to be.

mgraczyk

6 days ago

I worked on debugging modem software at Qualcomm in 2011, also prerelease 4G networking. I believe that LLMs would have dramatically improved my productivity across nearly all tasks involved (if they would allow me to use an LLM from inside the faraday cage).

ryukoposting

6 days ago

I write embedded firmware for wireless mesh networks and satcom. Blend of Rust and C.

I spent ~4 months using Copilot last year for hobby projects, and it was a pretty disappointing experience. At its best, it was IntelliSense but slower. At its worst, it was trying to inject 30 lines of useless BS.

I only realized there was an "agent" in VS Code because they hijacked my ctrl+i shortcut in a recent update. You can't point it at a private API without doing some GitHub org-level nonsense. As far as my job is concerned, it's a non-feature until you can point it your own API without jumping through hoops.

ramraj07

6 days ago

You used one AI tool that was never more than autocomplete a year ago and you think you have a full hold of all that AI offers today? That's like reviewing thai food when you've only had Chinese food.

ryukoposting

6 days ago

>you think you have a full hold of all that AI offers today?

I absolutely don't, and I'd love if you could highlight a spot where I suggested I was. As I said, the problem isn't that I don't want to try using an agent, the problem is that I can't because one incredibly basic feature is missing from VS Code's agent thing.

I'll occasionally use chatbots, mostly for spitballing non-professional stuff. They seem to do well with ideation questions like "I'm making X, what are some approaches I could take to do Y?" In other words, I've found that they're good at bullshitting and making lists. I like R1-1776, but that's only because Perplexity Playground seems less restricted than some of the other chatbots.

It's also nice for generating some boilerplate bash stuff, when I need that kind of thing. I don't need that very often, though.

frutiger

5 days ago

> That's like reviewing thai food when you've only had Chinese food.

What's the implication here? That Thai food was invented 1 year after Chinese food?

manmal

6 days ago

Here's a 3+h video of the PSPDFKit (Nutrient) founder vibe-coding a Mac app. Can be watched at 2x: https://steipete.me/posts/2025/the-future-of-vibe-coding?utm...

greybox

6 days ago

> All this is to say: I write some Rust. I like it fine. If LLMs and Rust aren’t working for you, I feel you. But if that’s your whole thing, we’re not having the same argument.

Yes we are, because the kind of work you need to do in C++ or Rust is probably entirely different from the work this person manages to get the LLM to do in Go.

jjcm

6 days ago

The most important thing in this article in my mind is in the level setting section - if you are basing your perspective on the state of AI from when you tested it 6mo+ ago, your perspective is likely not based on the current reality.

This is kind of a first though for any kind of technology. The speed of development and change here is unreal. Never before has a couple months of not being on top of things led to you being considered "out of date" on a tool. The problem is that this kind of speed requires not just context, but a cultural shift on the speed of updating that context. Humanity just isn't equipped to handle this rate of change.

Historically in tech, we'd often scoff at the lifecycle of other industries - Airlines haven't changed their software in 20 years?? Preposterous! For the vast majority of us though, we're the other industry now.

Arainach

6 days ago

>pull in arbitrary code from the tree, or from other trees online, into their context windows, run standard Unix tools to navigate the tree and extract information, interact with Git, run existing tooling, like linters, formatters, and model checkers, and make essentially arbitrary tool calls (that you set up) through MCP.

....for the vast majority of my career, anyone who suggested doing this - much less letting code that no one in the world (much less the company) truly understands the logic flow of do this - would be fired.

kazinator

5 days ago

> You’ve always been responsible for what you merge to main.

I've never merged anything with an unclear copyright to main, let alone if I positively suspected that it had been plagiarized.

> Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.

Every place I've been employed was nearly paranoid about copyright, with rules in place about never copying and pasting anything copyrighted into the company code base.

All the major open source distros and projects, likewise.

There are some bad apples out there, like, oh, Sam Altman, who wants massive plagiarism to be fair use.

So, au contraire, you can shove your contemptuous blog up your ass, buddy.

bananamerica

4 days ago

I know this is a website for programmers and I am not one. I like LLMs for quick answers and simple tasks like proofreading (just to fix errors, not to add or change anything) and simple text formating. I absolutely hate for everything else. I am a writer and LLMs prose may be correct and sometimes whimsical but it has soulless quality to it that makes me forget about what I just read after five seconds. My friends use Meta AI on our chat groups on WhatsAppp and I hate it so much, I delete every single AI message as soon as I see it.

kopecs

6 days ago

Seriously? Is this argument in all earnestly "No profession has been more contemptuous therefore we should keep on keeping on"? Should we as an industry not bother to try and improve our ethics? Why don't we all just make munitions for a living and wash our hands of guilt because "the industry was always like this".

Seems a bit ironic against the backdrop of <https://news.ycombinator.com/user?id=tptacek>:

(although perhaps this is tongue-in-cheek given the last year)

panorama

6 days ago

It's a fib sequence

beej71

6 days ago

> I like moving the code around and chuckling to myself while I delete all the stupid comments. I have to read the code line-by-line anyways.

I'm glad I got out when I did. :) This simply isn't something I want to do for a living. I'm glad there are people who do, though.

alex_suzuki

5 days ago

Judging from your username, you are 10 years older than I am. What are you doing now? Early retirement? Or career change?

beej71

5 days ago

Went from 20 years of dev to teaching CS at a university--for as long as that lasts. Call it "semi-retirement". :)

After that, who knows. Maybe the whole system collapses and I just go to work as a tutor for some rich peoples' kids in exchange for loaves of bread.

pmdrpg

6 days ago

This op ed suggests that it’s easier to audit a huge amount of code before merging it in than is to write the code from scratch. I don’t know about anyone else, but I generally find it easier to write exactly what I want than to mentally model what a huge volume of code I’ve never seen before will do?

(Especially if that code was spit out by an alien copypasta that is really good at sounding plausible with zero actual intelligence or intent?)

Like, if all I care about is: does it have enough unit tests and do they pass, then yeah I can audit that.

But if I was trying to solve truly novel problems like modeling proteins, optimizing travel routes, or new computer rendering techniques, I wouldn’t even know where to begin, it would take tons of arduous study to understand how the new project full of novel algorithms is going behave?

outlore

5 days ago

The author mentions that smart programmers kick off agents, then go for a coffee and wait to be notified.

But agents like Claude Code and Cursor run on a user’s machine right? So how can they work in the background like that? are there agents that can work async?

dismalaf

5 days ago

Fun article, although I'd say I'm maybe slightly more on the skeptical side.

I've found LLMs can't really write a whole program there isn't already a template for. They're good enough at writing individual functions, and as the author mentioned, their breadth of knowledge is better than any individual.

Honestly, I see them as a supercharged version of googling, without all the out of context answers you get from Google since they also place ads and whatnot. I find an LLM can get me an answer far quicker than a regular internet search, even if it's imperfect and I'm probably not going to put it straight into my code.

stego-tech

6 days ago

Finally, a proper good take from someone (seemingly) approaching it rationally and objectively, rather than blindly sitting on either side of the hype fence. LLMs are good tools, and even I, a decidedly-not-software-engineer, have been increasingly tinkering with locally-run tools to create applications and tooling that meets my own needs. The one thing that’s held me back from publishing has been the IPR angle, but I quite like this line on that topic:

> But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.

The OP isn’t wrong. Some of the most brilliant minds in computer science have built tools to shatter DRM, enable widespread piracy, and repeatedly lift code wholesale from public repos if it helps them move a product forward. The modern crop of technologists (myself included) have long had a hypocritical view on IPR: “rights for me, not for thee” to oversimplify things. The entirety of the modern public cloud is built by trodding on the backs of open source projects while wearing lawn-aerating spikes. We steal shit from others so our Founders and Masters can reap billions in profit; to decry plagiarism now feels incredibly dishonest and insincere, at the very least.

Look, I’m an AI skeptic myself. On the whole, these tools are bad. They’re taking jobs, polluting the planet, and dumbing down a society that still can’t try rebooting something when it fails, let alone identify where their WiFi ends and the internet begins. They’re toxic walled gardens that can reauthor reality to the whims of the VCs and Billionaires funding them, locking you into subscriptions forever.

But even I will admit that modern LLMs, when it comes to writing code, are actually really good for the majority of C-tier devs and below. Are they as helpful to bleeding-edge engineers on the cusp of the future? Nah, not really, but if you’re just writing a checklist app for iOS to help you stay on task, or a Packer manifest to lay out some VM templates, they’re kinda awesome.

omot

6 days ago

I think of programming languages as an interface between humans and computers. If anything, the industry expanded because of this abstraction. Not everyone has to learn assembly to build cool shit. To me AI is the next step in this abstraction where you don't need to learn programming languages to potentially build cool projects. The hard part of software engineering is scale anyways. My bet is that this will expand the industry in unprecedented ways. Will there be contraction of traditional programming jobs? Absolutely. The growth in tech jobs over the last 20 years weren't more assembly programmers. They were abstraction experts. I'm sure the next wave will be even bigger, professional prompting will explode in size.

TheRoque

6 days ago

The C abstracting the assembly or the GC a abstracting away memory management work because they were possible to implement in a deterministic and reliable way (well, in the case of garbage collection, not all the time)

But I don't think that's a similar situation for LLMs, where the hallucinations or failure to debug their own issues are way too frequent to just "vibe code"

aprilthird2021

5 days ago

He forgot "but the 15 year old behemoth codebase that doesn't fit in any context window and isn't well documented and changing of which requires changing tons of ancillary files not easily connected to each other logically".

That's what a lot of the highest paid programmers work on. I work in such a firm and the agent they have for us is incapable. And it's the latest and greatest from whoever coders love to rave about online.

I love that AI written tests and other generally greenfield things work well and can be vetted quicker than writing and getting the code oneself. But for most more complex tasks it's not there. I'm ready when it is though

cratermoon

5 days ago

"If you build something with an LLM that people will depend on, read the code. In fact, you’ll probably do more than that."

Thus sucking up all the time you "saved" by having the LLM write the code you saved. The machine made the easy parts of coding easier while making the hard parts harder.

"You’ll spend 5-10 minutes knocking it back into your own style."

As if style is the only thing that matters?

"LLMs are showing signs of adapting to local idiom, but we’re not there yet."

First step fallacy. Ladders are showing signs of being tall enough to reach the moon, but we're not there yet.

d--b

6 days ago

Yeah, and it’s progressing so fast. Singularity is definitely on the table.

Whoever says otherwise should read their own comments from 2 years ago and see how wrong they were about where AI is today.

Not saying singularity will happen for sure, but is it a possibility? Hell yeah.

suddenlybananas

6 days ago

It's not really that different than 2 years ago. Better but not qualitatively so.

yahoozoo

5 days ago

Tell me you don’t understand LLMs without telling me you don’t understand LLMs.

d--b

5 days ago

I definitely don’t understand LLMs. Kind of the point though.

yahoozoo

5 days ago

Will LLMs ever break the “you have to treat it like an intern/junior” ceiling?

devmor

5 days ago

For some reason, every hardcore AI proponent I hear from resorts to gaslighting and namecalling to convince everyone it's so useful for everything.

When that doesn't work, they start inventing hyper-specific scenarios and graphs with grounding that exists only in their imagination to push the topic further. If the AI can't solve a problem in a domain for you, well, it's that domain that's the problem, not AI. You simply need to re-invent the paradigm around your entire problem - easy stuff.

Finally, when they've failed to convince anyone with a modicum of rationality, they resort to some version of "well, fuck you, it's the future and I'll just be leaving you behind!"

Interestingly, this article has all three.

debrisapron

5 days ago

I want to skip all the debate and just have someone tell me what setup I need to get to here:

mrcwinn

6 days ago

>It’s getting the same kind of attention that smart phones got in 2008, and not as much as the Internet got. That seems about right.

Are you kidding me? Mosaic was released in 1993. In 1998 Apple released the iMac on a theory, based on some marketing research, that people might be really interested in "getting online." Five years. And that's about a decade before we realized we might want a touch phone. More than a decade before an App Store.

This is very much still the top the first. AI will make even the internet seem small. Which makes sense — the entire whole internet is merely one input into this technology.

epolanski

5 days ago

LLMs are a tool, a very precious one if you know how to use it.

I've seen Antirez use it for hard to find bugs in the Redis codebase (videos are on YouTube, albeit Italian), I've seen Cursor refactor in a matter of minutes code that would've taken hours, correctly, I've seen security researchers leverage it to find complex edge cases, etc

My 2 cents is that developers that won't learn to use the tools will be left behind (albeit not for long imho), but also, that sales people promoting the idea that devs can be replaced equally dangerous.

abdullin

6 days ago

My current workflow with Codex is (coding environment from OpenAI):

(1) Ask to write an implementation plan for a specific change or a feature. It will go through the source code, look up references, make notes and produce a plan

(2) Review the plan. Point out missing things, or stuff that needs improvement.

(3) Once I'm satisfied with the plan - ask to draft PR. Launch a few attempts in parallel and pick the one that I like the most.

(4) While drafting PR, Codex will run unit tests (even can run E2E tests in its container), linting and type checkers at every single step. This helps a lot with the stability.

(5) I review the code and merge the PR if I like it. Ask to cleanup - if not.

This feels like working with a remote team - very patient and diligent at that.

Ultimately, I get to get more features done per day. But I also feel more tired by the end of the day due to a higher level of cognitive load. There are more decisions to make and less idle time (e.g. no more hours spent tidying up the code or doing relaxing and pretty refactoring).

TLDR; this AI thing works really well at least for me. But it comes with trade-offs that might slow down its adoption by companies en masse.

darepublic

6 days ago

I do use LLMs for coding and the newer models have definitely been a blessing. I don't know about using coding agents (or agentic coding) though. I personally do not find this better than chatting with the llm, getting the code back and then copy / pasting it and grokking / editing it. The author of this seems to suggest that.. there is one correct flow, his flow (which he doesn't entirely detail) and everything else is not appropriate. He doesn't go into what his process is when the LLM hallucinates either. Not all hallucinations show up in static analysis.

fossuser

6 days ago

Yes! It's amazing how even in a field that tends to lean more early adopter than average you still get a lot of the default knee-jerk dismissal and cynicism - even when it's something clearly amazing and useful as thinking machines.

We're in the middle of a major shift - there will benefits to those that adapt first. People outside the field have no idea what's coming, even those of us in the field are underestimating the shift.

There were a few outliers in the 60s who understood what the computing revolution meant and would mean, but most did not. This is likely an even bigger change than that.

holoduke

6 days ago

"If you’re making requests on a ChatGPT page and then pasting the resulting (broken) code into your editor, you’re not doing what the AI boosters are doing"

I am actually doing this the whole day long. For example i have setup today a fresh new debian vps for some interns. U had to provide them with a docker system, support for go, nginx stuff and i made a quick hello world app in angular with a go backend. I could have done it myself. But i asked chatgpt to provide me with all the commands and code. No idea how an agent could do this for me. I got everything running in like 30 minutes.

einpoklum

5 days ago

> LLMs can write a large fraction of all the tedious code you’ll ever need to write.

For most software projections, they cannot even remotely do that - at least not today and as far as I can tell.

And that is to say, it's not that they can write a not-so-large fraction. They cannot write any significant fraction of it. They can often write you a draft that helps you get there faster, though. They can also facilitate web search result consolidation etc. which can also help. But - they can also lead you down wrong paths and dead-ends, on occasion.

creativenolo

6 days ago

If you’re leaning out, spend two weeks leaning in.

I did, and learned a ton, and likely not going back to how I was before, or how I used it a week ago.

The comments in the article about not reading the agent is good but it’s more than that…

Vibe coding is for non-coders. Yet, you get a feel for the vibe of the AI. With windsurf, you have two or three files open, and working in one. It starts smashing out the multi, interspersed, line edits and you know with a flutter of your eyes, it’s got your vibe and correctly predicted your next ten lines. And for a moment you forgive it for leading you astray when you read what it said.

jasonm23

4 days ago

Here's an idea... imagine you are just starting, you know nothing... an AI can and will help you.

Tell me you won't use it and explain properly, without appeal to "legacy" or "human values" or other abstracts which don't self validate.

Now note that this is POV for every child on earth.

The new discipline is provider of LLM/AI guard rails, and absolutely requires knowing what is/not correct.

tshaddox

5 days ago

I don't doubt that LLMs will have a large impact on software and programming. But I do wonder whether it will be a positive impact. I think one of the biggest problems the software industry has is that it's far too rare for the people building software to have deep explanatory knowledge about what they have built. LLMs threaten to make this problem vastly worse than it already is.

bdangubic

5 days ago

the reason this is rare is that no one stays on any one project long enough for this to be a reality anywhere. you know anyone working 15-20 years on the same project? I’ve been on many projects where deep explanatory knowledge existed through and through until the project was initial-code-complete and then people left to do other amazing things leaving project with people with limited knowledge of it. I suspect this is the case for vast majority of software

SpicyLemonZest

5 days ago

> Professional software developers are in the business of solving practical problems for people with code. We are not, in our day jobs, artisans. Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed. If anything we build endures, it won’t be because the codebase was beautiful.

I think this is my key point of disagreement. Professional software developers often don't see themselves as artisans, because Steve Jobs's unseen feet are implemented through our implicit standards of ourselves. I don't go out of my way to be a perfectionist on the job, but neither will I try to push 1000 lines of spaghetti to solve a minor problem that seems like it ought to be a simple fix.

I've worked with people in the past who don't have this instinct, and it's my experience that their productivity is quite low if not negative, even though they often perceive themselves to be 10x developers who can handle all kinds of edge cases super fast. I suspect that most AI-generated code today ends up in this niche.

bsder

6 days ago

Here's what to do: Show me a video of LLM fixing four filed issues in the KiCad codebase.

If you do that, I'll swallow my AI skepticism.

I would love to have an LLM that I can turn loose on an unfamiliar codebase that I can ask questions of. I would love to have an LLM that will fill in my Vulkan boilerplate. etc.

I use emacs and Mercurial. You can demonstrate magic to me and I can be convinced even if it's not mainstream.

Rewriting Javascript slop to StackOverflow standards is not convincing me.

Get to it.

(The OAuth stuff posted earlier certainly moved my needle, but the fact that they needed a gaggle of reviewers as well as hand holding when the LLM got stuck mutes the impact significantly.)

supermatt

5 days ago

i am no clearer on what setup the author is using? are you hand coding all these non-code agents you mention? You talk about zed, then about cursor - don't they both do the same thing, or are you using 2 IDEs to write your code? What tooling is used for these async agents that are doing things while i eat my breakfast?

I really want to like AI assisted coding, but I guess i am just stuck as to what to do.

I have tried co-pilot, both 2 years ago, and recently. I have tried some other AI plugin on VScode (i thought that was cursor, but I don't see it in the extensions when i now search, so it was probably something else). If anything, I have found with these solutions that they are great for the first few tasks - generate these models, scaffold this, add oauth, etc - but after that they lack all consistency with the code they themselves generated (like if i add a new model, the output is wildly different to what it already created) and it takes more time to rewrite that output than to copy-paste and change whats already there.

@op - could you explain exactly what tooling you are using, maybe give an example of these async agents, etc? Would be most appreciated for us luddites who would love to be more productive.

femiagbabiaka

5 days ago

I agree wholeheartedly with this piece, but I think it might've been taken better if it wasn't written in the style of an HN comment. That being said, definitely doesn't deserve so much flaming in the comments, sorry to see that, because it's basically correct, with some small things I disagree with here and there. Feels exciting to be on the precipice of having a Star Trek style spaceship computer at my fingertips.

meroes

6 days ago

> LLMs can write a large fraction of all the tedious code you’ll ever need to write.

But, you still have to read it:

> Reading other people’s code is part of the job...I have to read the code line-by-line anyways.

So instead of writing the tedious code, I only have to read it. Oh but don't worry, I don't have to read it too carefully because:

> Agents lint. They compile and run tests. If their LLM invents a new function signature, the agent sees the error

But remember...

> You’ve always been responsible for what you merge to main.

So now I have to oversee this web of agents and AI ontop of coding? Am I doing more now for the same pay? Am I just speedrunning myself toward lower pay? Is AI adoption a prisoner's dilemma toward lowing my wages hardest?

Because is good at coding compared to many other disciplines (e.g. math), it makes the internal AI politics among programmers more of an issue. Add fuel to that fire baby!

aabajian

5 days ago

I've observed that AI excels at creative work, but not exacting sciences. For example, If I say, "Write the code to put the button in the upper right corner," I'll almost certainly get a button in the upper right.

Conversely, if I say, "Put the button in the upper right corner exactly like the existing button in the upper left," I'll get a button in the upper right, but not exactly like the upper left button.

tstrimple

5 days ago

This is the same as every other transformative tool which has come to market. You've got a certain segment of the population who will embrace it and push the boundaries of what is possible with the current tools and those in the future. And you've got the curmudgeons who won't let go of the one tool and one non-cloud hosting solution they so desperately cling to and deny the utility of anything else.

tedious-coder

6 days ago

AI makes me sad. When I started my CS degree, I didn't even know what silicon valley was. I was unaware of what the SWE job landscape was like. I went to school in a no-name town.

Computer science was an immensely fun subject to learn. I moved to one of the big cities and was bewildered with how much there was to learn, and loved every second of it. I gradually became good enough to help anyone with almost anything, and spent lots of my free time digging deeper and learning.

I liked CS and programming - but I did not like products built by the companies where I was good enough to be employed. These were just unfortunate annoyances that allowed me to work close enough to what I actually enjoyed, which was just code, and the computer.

Before LLMs, those like me could find a place within most companies - the person you don't go to for fast features, but for weird bugs or other things that the more product-minded people weren't interested in. There was still, however, an uncomfortable tension. And now that tension is even greater. I do not use an LLM to write all my code, because I enjoy doing things myself. If I do not have that joy, then it will be immensely difficult for me to continue the career I have already invested so much time in. If I could go back in time and choose another field I would - but since that's not possible, I don't understand why it's so hard for people to have empathy for people like me. I would never have gone down this path if I knew that one day, my hard-earned-knowledge would become so much less valuable, and I'd be forced to delegate the only part of the job I enjoyed to the computer itself.

So Thomas, maybe your AI skeptic friends aren't nuts, they just have different priorities. I realize that my priorities are at odds for the companies I work for. I am just tightly gripping the last days that I can get by doing this job the way that I enjoy doing it.

simonw

6 days ago

I recommend reframing this.

LLMs don't make your hard-earned-knowledge less valuable: they make it more valuable.

You are better qualified to use them to build great software than people who don't have your level of experience and software engineering domain expertise.

If you don't want to do that then I guess you can find another career - but if you switch careers because you incorrectly think that LLMs make programming experience less valuable you would be making a big mistake in my opinion.

tedious-coder

6 days ago

I agree with your assessment of the value of the skills, at least for the time being. What I dislike is the way that we are being encouraged to work now. I simply do not find any joy, at all, in reviewing LLM-written code and then typing in the next prompt.

A sentiment I see often is that it's work, it's not supposed to be fun, and you work at the pleasure of the employer. And I accept that. But I still am really just crushingly sad that this is what my job is becoming.

In the article, Thomas wrote:

> LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious. LLMs drastically reduce the number of things you’ll ever need to Google. They look things up themselves. Most importantly, they don’t get tired; they’re immune to inertia.

I see this as a massive downside, because I loved writing tedious code. I loved reading docs on something I previously didn't understand. I loved forming the mental models strong enough to say "yeah I see why that's there" in the previously-inscrutable APIs of the frameworks and such that I was using. It was precisely the _way_ that I approached that work that allowed for that knowledge to accrue. It was because I almost never just copy/pasted something without spending a lot of time to understand it.

I do some of the same with ChatGPT. I type the code in myself after trying to internalize the ChatGPT response. But even that is starting to feel like company time-theft, as the attitude is shifting even further away from "knowing how to do things is good" toward "getting shit done is all that matters."

metaltyphoon

5 days ago

> You are better qualified to use them to build great software than people who don't have your level of experience and software engineering domain expertise

Since a big majority of companies stopped hiring juniors, where is the new blood coming from when the inevitable more seniors retire?

Karrot_Kream

6 days ago

I think the important thing here is you're being honest about how you're feeling. You bring up a very real anxiety and possibility and even folks who are using LLMs probably feel some degree of this alienation. That LLMs are yet another tool to push us to move as fast as possible rather than have our brains get into the nooks and crannies of hard problems that may take longer but are more rewarding to us.

But again, you're being honest. The problem with a lot of the AI skeptic arguments I see is a lack of this honesty. Others have noted that there are a lot of contradictory skeptical arguments, and I suspect the contradictions come because the authors have negative emotions about AI which they're using to create negative arguments.

tedious-coder

5 days ago

I do fall into this category of people that are seen as heavily abusing copium. I can admit that when I do get unsatisfactory results from a prompt session, a lot of it has to do with the mental friction I feel at the idea of letting something else write my code.

It again is coming back to my opinion the LLMs have recreated the job in such a way that it emphasizes what I disliked most, and de-emphasizing what I liked. It emphasizes "the goal" and de-emphasizes "the process". We had a period in the 10's where the process (namely, becoming adept at using and learning an ever-changing set of open source tools) was a bit more celebrated. You could justify a lunch-and-learn on things like man pages, commit isolation levels, or package manager - and doing something like that would be seen in a positive light. And now, why would you waste everyone's time talking about something that ChatGPT can figure out for you?

Anyway, thanks for your time in your response.

Leo-thorne

5 days ago

At first I didn’t think LLMs were that impressive, but ever since I started using them, I’ve pretty much stopped writing for loops by hand. It’s not that I forgot how. I just don’t feel like doing it anymore. The time it saves me on boilerplate lets me focus on other things. Sometimes the code isn’t perfect and I need to fix it, but it’s still way easier than starting from scratch.

lomase

5 days ago

99% of the code I write today does not use for loops. I don't need an IA for that, just C# or any map function.

mouse_

6 days ago

> Extraordinarily talented people are doing work that LLMs already do better, out of spite.

So what, people should just stop doing any tasks that LLMs do subjectively better?

tptacek

6 days ago

I don't know the full answer to this question, but I have a partial answer: they should at least stop doing tedious tasks that LLMs do better.

p-o

6 days ago

Unrelated to your friends, but a big part of learning is to do tedious tasks. Maybe once you master a topic LLMs can be better, but for many folks out there, using LLMs as a shortcut can impede learning.

tptacek

6 days ago

I'm ~8,000 XP into MathAcademy right now, doing the calculus stuff I skipped by not going to college. I'm doing a lot, lot, lot of tedious practice. But I know why I'm doing it, and when I'm doing doing it, I'm going to go back to using SageMath to do actual work.

asadotzler

6 days ago

Some of us thrive in tedium, and also do it better than bots.

pixl97

6 days ago

And then sit in shock when you're replaced with an auto-loom.

cwoolfe

5 days ago

"Reading other people’s code is part of the job. If you can’t metabolize the boring, repetitive code an LLM generates: skills issue! How are you handling the chaos human developers turn out on a deadline?" Good point! Treat AI generated code as if somebody else had written it. It will need the same review, testing and refactoring as that.

barrell

5 days ago

This blog post makes me actively want to use something besides fly. Putting aside the massive condescension in the article…

… most of the code you write is tedious boilerplate? What do you do all day? I can honestly say the most tedious code I write is migrations, because I have to edit three files. I do this like once a week. It would take way longer to explain to an ai what to do than just do it myself.

… you run the ai on your procrastination and as a result do heads down work? Seriously? 1, programming with AI is like playing a slot machine. It keeps you hooked just on the verge of winning the jackpot for far longer than the tedious task would have distracted you. And let’s say a good agentic case - it runs for 30 minutes before it’s done. That’s not even enough time to “boot up” a serious problem, and your already having to put it aside to go check the work you yourself are saying is mediocre? And you’re advocating to run multiple of these? Even the most ardent vibe covers I know talk about babysitting the agents. How is this unlocking deep work time in any way shape or form?

… craft is relagated to hobbies? I guarantee you, this person loves Linear. It was probably written on a MacBook. This whole “we solve real problems” BS is just a cop out because being great is hard.

… mediocre code is good code? All code is a liability. If you generate 10 times the code, and are admitting it’s the lowest bar of quality possible, it is at a minimum 10x the liability on your shoulders.

… letting LLMs refactor your tests? Serious question, who actually does this? Because I would LOVE to have LLMs write tests for me. I have tried every. single. time. I need to write something that needs a lot of tests to ask LLMs for tests for TWO YEARS, and not once have I ever used a single test it generates. It generates worse tests than it does code, and the last thing I want is 42 asserts in 8 tests across 600 LoC that nobody understands that poorly tests what should have been 2 asserts in 2 tests in 12 lines of code.

Glyptodon

5 days ago

I think he's pretty much right, but it just leaves me managing an AI and hopefully having fewer offshore developers and faster turnaround. So something less than a 50% boost, and kind of variable, but still significant. I don't really think it's world shattering because most software is deeply constrained by business processes (deciding what to build, etc.) that this doesn't help.

leecarraher

5 days ago

It depends on what level of nuts you mean. Some are AGI skeptics about LLMs, theyre probably right, there is likely more breakthroughs required before true AGI. But AGI isn't required to completely disrupt a ton of good, well-paid professions. That is the more worrying scenario. AI is already widening the wealth gap irreparably and with more progress it will only continue.

energy123

5 days ago

> but it’s bad at rust

Is it really? I scraped all HN comments with the words "Rust" and "LLM" from the last 6 months, put it into Gemini and asked it to surface quotes of people saying they had a good time, versus quotes of the opposite. It was about 60% positive and 40% negative.

Does anyone have recent personal experience with this they can talk about?

nmeofthestate

5 days ago

This sounds very cool - almost science-fiction in parts. I want to know more details about the specific technology, whether it could handle a 30k C++ codebase, how much work you would need to do to make it learn how to run your tests, analyse their output, interface with all your internal software development services etc.

QuantumGood

4 days ago

For writing, I do a back-and-forth about what I'm trying to do with Claude before asking it to write anything. Produces much less "that's not exactly what I was looking for..."

weitzj

5 days ago

I was quite happy and entlohntes when I dropped an HTML table Formates with Emojies into my prompt and told to turn it into a terraform module.

But how do you quality check this part? I am proofreading all these transformations and I am not sure if this is really faster than actually recording a VIM macro, which will do the steps I need.

bowsamic

5 days ago

This article doesn’t address literally THE argument I have against LLM usage: that it will degrade major human skills like reading, writing, and yes, coding

But I’m at least grateful for LLMs making me more aware of how important it is for me to focus on my own skills building. I will never use it because I value having those skills sharp

stopachka

6 days ago

tptacek, curious question: what agent / stack do you currently use?

tptacek

6 days ago

I use Codex CLI for casual stuff, because of the ergonomics of just popping open another terminal tab.

I use Zed as my primary interface to "actually doing project work" LLM stuff, because it front-ends both OpenAI and Google/Gemini models, and because I really like the interface. I still write code in Emacs; Zed is kind of like the Github PR viewer for me.

I'm just starting to use Codex Web for asynchronous agents because I have a friend who swears by queueing up a dozen async prompts every morning and sifting through them in the afternoon. The idea of just brainstorming a bunch of shit --- I can imagine keeping focus and motivation going long enough to just rattle ideas off! --- and then making coffee while it all gets tried, is super appealing to me.

andersa

6 days ago

I'm curious how much you paid in the past month for API fees generated by these tools. Or at least what order of magnitude we're talking about.

tptacek

5 days ago

I don't know, like, weekly coffee shop tab levels of money?

andersa

5 days ago

Interesting, I guess the ones I saw complaining about blowing through hundreds worth of API calls per session were just doing something wrong?

tptacek

5 days ago

I don't know, it depends on what they were accomplishing. "Hundreds of dollars" (in expectance) is not a meaningful amount stood up against any significant amount of shipping code.

mkfs

5 days ago

> I use Zed as my primary interface to "actually doing project work" LLM stuff, because it front-ends both OpenAI and Google/Gemini models, and because I really like the interface. I still write code in Emacs; Zed is kind of like the Github PR viewer for me.

You're not concerned about OpenAI or Google stealing your code? I won't use VSCode for that reason, personally, but I do use VSCodium.

tptacek

5 days ago

I could not be less concerned with OpenAI or Google stealing my code.

mppm

5 days ago

That's an interesting sentiment, coming from you. Is privacy officially over then? Should I stop pushing my friends to use Signal over WhatsApp?

mrmansano

6 days ago

> I'm just starting to use Codex Web for asynchronous agents because I have a friend who swears by queueing up a dozen async prompts every morning and sifting through them in the afternoon

Bunch of async prompts for the same task? Or are you parallelizing solving different issues and just reviewing in the afternoon?

Sounds intriguing either way.

kasey_junk

5 days ago

I _think_ I’m the friend being referenced. I’m parallelizing solving different issues. Basically I keep an internal swim lane of a variety of projects and just kick off the next task in the lane I think the agent can handle.

Then I do my “real” work, there’s the stuff I don’t trust the agent with, or is more exploratory or whatever.

As I think of more agent tasks doing that I write them down. When I take a break, say for lunch or winding down at the end of the day I check back in on previous tasks and fire off the new ones.

My flow is very similar to what I did with junior eng except I’m willing to fire off even more trivial tasks at the agent because I don’t care if it sits idle. Similarly if it gets way off base I’m happy to kill the pr more aggressively and start over, what do I care if it wasted its time or if it learns a valuable lesson from the experience?

stopachka

6 days ago

Thank you!

bloat

6 days ago

Seconded. I am very much still in the mode of copying from the chat window and then editing. I would like to have whatever she is having.

gerdesj

5 days ago

Try getting your fav LLM to do something useful with PowerCLI. That's the VMware PowerShell thing. It is quite popular but not enough, so that most prompts will end up with bollocks, masquerading as code.

When a LLM has been tamed to the point that I can define and request a PowerCLI programme, I can be fairly sure that AI is a solved thing.

bayindirh

5 days ago

I'll be a happy and ignorant nut as long as these models are trained on publicly scraped code without their creators' consent, and the models fail to cite the licenses and sources of the parts the answer is composed from.

Oh, GH Copilot was adding citing capabilities which was going to be in production around this time? Any progress on that front?

Thanks.

blibble

6 days ago

> We’re not East Coast dockworkers; we won’t stop progress on our own.

we could choose to be

of course if you're a temporarily embarrassed billionaire like ptacek, you certainly don't want the workers doing this

saghm

5 days ago

Yeah, this is where I went from being on the fence to just checking out entirely. Even if it did mean that the top earners make slightly less (which I'm dubious about) and I happened to be one of them (which I'm not confident in, but I could comfortably survive with a lower salary than I've actually received most of the time), I'd still be in favor of tech workers unionizing because the benefits outweigh the costs both personally and for the larger group.

I've been hired to a fully distributed team only to have the company decide to force everyone to "return" to the office within a couple years despite the team never having been in any office in the first place; I've had managers promise me raises that don't appear, only for me to find out later that they never actually talked to their superiors about it like they claimed; I've seen teammates get hired and laid off six months later the same week that we have someone else we just hired join the team. None of this stuff is reasonable, and for some reason we collectively just throw up our hands and say there's nothing we can do, and then apparently some of us scoff and look down at lower-paid workers banding together to try to get a semblance of the advantages we take for granted in our industry because they're "stopping progress"? Even if I did accept the premise that workers collectively organizing to win rights affected "progress", there's no reason to assume there's no middle ground between the extremes of "no progress" and "full-throttle progress without regard for consequences". It's farcical.

dontlaugh

6 days ago

It’s the class interests that make it very obvious who is trying to lie to you about AI.

The Luddites were of course correct, like most labour movements.

tptacek

5 days ago

Kind of saying the quiet part loud here.

bowsamic

5 days ago

Why not?

tptacek

5 days ago

I mean, I appreciate it being laid out, so I don't have to worry about people saying "absolutely nobody is making that preposterous argument, nobody wants us to be like the east coast dockworkers, it's just...".

bowsamic

4 days ago

Why is it a bad argument?

candiddevmike

6 days ago

This is pretty terrible for the fly brand TBH. Yeesh, politics in a company blog...

YuukiRey

5 days ago

Adding fuel to the fire. I'm extremely disappointed to see such an inflammatory article on fly.io. I wouldn't want this on my personal blog, let alone on my employers, but I guess they're fine with it.

I've been using Zed and Claude Sonnet 4 (and sometimes trying Opus) heavily over the past weeks. For small edits where I have lots of unit tests, the results were great. So great that they worry me with regards to job security. For exploring a new programming domain it was also somewhat useful. I work a lot with the Typescript compiler API right now, and it has almost no documentation. Since the AI can see into every GitHub repository out there, it's much better, and more efficient, at learning APIs based on code from other folks. On the other hand it means I don't do that job, and I am forced to rely 100% on how the AI presents the Typescript compiler API to me. Are there better methods I could use? Who knows.

Where it's abysmal is code architecture. Sometimes it's almost comical: it adds an if statement to handle one highly specific edge case in a program that only makes sense if it solves the general case. This didn't happen often thought.

The hardest part was to force it to reuse existing code from the same file. My use case is transforming a Typescript AST into a GraphQL AST. The code is one big switch statement with lots of recursive calls. The AI would often add 300 lines of code that duplicate some logic which already exists somewhere else.

In the end I rewrote the whole thing from scratch. At around 900 lines of code the AI was starting to really struggle. When I wanted to take over, I realized that I didn't have the in-depth knowledge to do so. And trying to understand the code the AI had written proved futile.

Ultimately that's on me, I should have been more diligent reviewing the dozens of 300 line of code changes the AI throws at me over the course of a day. But I wasn't, because reviewing is really, really hard. For many reasons. And AI makes it even harder.

Am I therefore nuts? I find this whole article extremely one sided. Surely, based on the sheer amount of both positive and negative press, the answer is somewhere in the middle.

neochief

3 days ago

Having the same exact experience as you.

xyst

6 days ago

Ask me again in 15 years. Assuming the world hasn’t already entered a war for the remaining resources on this planet.

justlikereddit

5 days ago

>Important caveat: I’m discussing only the implications of LLMs for software development. For art, music, and writing? I got nothing. I’m inclined to believe the skeptics in those fields.

What a strange disclaimer. Image generation was a (career)killer feature long before vibe coding was halfway useful.

metalliqaz

6 days ago

Can someone explain to me what this means?

> People coding with LLMs today use agents. Agents get to poke around your codebase on their own. They author files directly. They run tools. They compile code, run tests, and iterate on the results. ...

Is this what people are really doing? Who is just turning AI loose to modify things as it sees fit? If I'm not directing the work, how does it even know what to do?

I've been subjected to forced LLM integration from management, and there are no "Agents" anywhere that I've seen.

Is anyone here doing this that can explain it?

tptacek

6 days ago

I cut several paragraphs from this explaining how agents work, which I wrote anticipating this exact comment. I'm very happy to have brought you to this moment of understanding --- it's a big one. The answer is "yes, that's exactly what people are doing": "turning LLMs loose" (really, giving them some fixed number of tool calls, some of which might require human approval) to do stuff on real systems. This is exactly what Cursor is about.

I think it's really hard to undersell how important agents are.

We have an intuition for LLMs as a function blob -> blob (really, token -> token, but whatever), and the limitations of such a function, ping-ponging around in its own state space, like a billion monkeys writing plays.

But you can also get go blob -> json, and json -> tool-call -> blob. The json->tool interaction isn't stochastic; it's simple systems code (the LLM could indeed screw up the JSON, since that process is stochastic --- but it doesn't matter, because the agent isn't stochastic and won't accept it, and the LLM will just do it over). The json->tool-call->blob process is entirely fixed system code --- and simple code, at that.

Doing this grounds the code generation process. It has a directed stochastic structure, and a closed loop.

metalliqaz

6 days ago

I'm sorry but this doesn't explain anything. Whatever it is you have in your mind, I'm afraid it's not coming across on the page. There is zero chance that I'm going to let an AI start running arbitrary commands on my PC, let alone anything that resembles a commit.

What is an actual, real world example?

tsimionescu

6 days ago

This all works something like this: an "agent" is a small program that takes a prompt as input, say "//fix ISSUE-0451".

The agent code runs a regex that recognizes this prompt as a reference to a JIRA issue, and runs a small curl with predefined credentials to download the bug description.

It then assembles a larger text prompt such as "you will act as a master coder to understand and fix the following issue as faithfully as you can: {JIRA bug description inserted here}. You will do so in the context of the following code: {contents of 20 files retrieved from Github based on Metadata in the JIRA ticket}. Your answer must be in the format of a Git patch diff that can be applied to one of these files".

This prompt, with the JIRA bug description and code from your Github filled in, will get sent to some LLM chosen by some heuristic built into the agent - say it sends it to ChatGPT.

Then, the agent will parse the response from ChatGPT and try to parse it as a Git patch. If it respects git patch syntax, it will apply it to the Git repo, and run something like `make build test`. If that runs without errors, it will generate a PR in your Github and finally output the link to that PR for you to review.

If any of the steps fails, the agent will generate a new prompt for the LLM and try again, for some fixed number of iterations. It may also try a different LLM or try to generate various follow-ups to the LLM (say, it will send a new prompt in the same "conversation" like "compilation failed with the following issue: {output from make build}. Please fix this and generate a new patch."). If there is no success after some number of tries, it will give up and output error information.

You can imagine many complications to this workflow - the agent may interrogate the LLM for more intermediate steps, it may ask the LLM to generate test code or even to generate calls to other services that the agent will then execute with whatever credentials it has.

It's a byzantine concept with lots of jerry-rigging that apparently actually works for some use cases. To me it has always seemed far too much work to get started before finding out if there is any actual benefit for the codebases I work on, so I can't say I have any experience with how well these things work and how much they end up costing.

pxc

5 days ago

The commands aren't arbitrary. They're particular— you write the descriptions of the tools it's allowed to use and it can only invoke those commands.

I'm interested in playing with this, since reading the article, but I think I will only have it run things in some dedicated VM. If it seems better than other LLM use, I'll gradually rely on it more, but likely keep its actions confined to the VM.

IshKebab

6 days ago

> There is zero chance that I'm going to let an AI start running arbitrary commands on my PC

The interfaces prompt you when it wants to run a command, like "The AI wants to run 'cargo add anyhow', is that ok?"

seabrookmx

6 days ago

They're not arbitrary, far from it. You have a very constrained set of tools each agent can do. An agent has a "job" if you will.

Maybe the agent feeds your PR to the LLM to generate some feedback, and posts a the text to the PR as a comment. Maybe it can also run the linters, and use that as input to the feedback.

But the at the end of the day, all it's really doing is posting text to a github comment. At worst it's useless feedback. And while I personally don't have much AI in my workflow today, when a bunch of smart people are telling me the feedback can be useful I can't help but be curious!

steveklabnik

6 days ago

> Is this what people are really doing?

Some people are, and some people are not. This is where some of the disconnect is coming from.

> Who is just turning AI loose to modify things as it sees fit?

In the advent of source control, why not? If it does something egregiously wrong, you can throw it away easily and get back to a previous state with ease.

> If I'm not directing the work, how does it even know what to do?

You're directing the work, but at a higher level of abstraction.

metalliqaz

6 days ago

> You're directing the work, but at a higher level of abstraction.

The article likens this to a Makefile. I gotta say, why not just use a Makefile and save the CO2?

steveklabnik

6 days ago

Being kind of like a Makefile does not mean that they're equivalent. They're different tools, good for different things. That they happen to both be higher level than source code doesn't mean that they're substitutes.

aykutcan

6 days ago

This is how I work:

I use Cursor by asking it exactly what I want and how I want it. By default, Cursor has access to the files I open, and it can reference other files using grep or by running specific commands. It can edit files.

It performs well in a fairly large codebase, mainly because I don’t let it write everything. I carefully designed the architecture and chose the patterns I wanted to follow. I also wrote a significant portion of the initial codebase myself and created detailed style guides for my teammates.

As a result, Cursor (or you can say models you selecting because cursor is just a router for commercial models) handles small, focused tasks quite well. I also review every piece of code it generates. It's particularly good at writing tests, which saves me time.

haiku2077

5 days ago

Zed has a great four minute demo showing how it works: https://zed.dev/agentic

I personally have my Zed set up so the agent has to request every command be manually reviewed and approved before running.

williamcotton

6 days ago

I run Cursor in a mode that starts up shell processes, runs linters, tests etc on its own, updates multiple files, runs the linter and tests again, fixes failures, and so on. It auto stops at 20 iterations through the feedback loop.

Depending on the task it works really well.

metalliqaz

6 days ago

This example seems to keep coming up. Why do you need an AI to run linters? I have found that linters actually add very little value to an experience programmer, and actually get in the way when I am in the middle of active development. I have to say I'm having a hard time visualizing the amazing revolution that is alluded to by the author.

williamcotton

6 days ago

Static errors are caught by linters before runtime errors are caught by a test suite. When you have an LLM in a feedback loop, otherwise known as an agent, then iterative calls to the LLM will include requests and responses from linters and test suites, which can assure the user, who typically follows along with the entire process, that the agent is writing better code than it would otherwise.

tsimionescu

6 days ago

You're missing the point. The main thing the AI does is to generate code based on a natural-language description of a problem. The liners and tests and on exist to guide this process.

The initial AI-based work flows were "input a prompt into ChatGPT's web UI, copy the output into your editor of choice, run your normal build processes; if it works, great, if not, copy the output back to ChatGPT, get new code, rinse and repeat".

The "agent" stuff is trying to automate this loop. So as a human, you still write more or less the same prompt, but now the agent code automates that loop of generating code with an LLM and running regular tools on it and sending those tools' output back to the LLM until they succeed for you. So, instead of getting code that may not even be in the right programming language as you do from an LLM, you get code that is 100% guaranteed to run and passes your unit tests and any style constraints you may have imposed in your code base, all without extra manual interaction (or you get some kind of error if the problem is too hard for the LLM).

icedchai

5 days ago

I let an agent upgrade some old C code that wouldn’t compile and had 100’s of warnings. It was running builds on its own, looking at new errors, etc. It even wrote some tests! I could’ve done this myself but it was a hobby project and tedious work. I was impressed.

adamgordonbell

6 days ago

you are giving it instructions but it's running a while loop with a list of tools and it can poke around in your code base until it thinks it's done whatever you ask for.

See Claude Code, windsurf, amp, Kilcode, roo, etc.

I might describe a change I need to have made and then it does it and then I might say "Now the tests are failing. Can you fix them?" and so on.

Sometimes it works very great. sometimes you find yourself arguing with the computer.

dannyobrien

6 days ago

This is also where I am, and I guess it has been a source of mild and growing consternation since I first blagged an OpenAI GPT account when they were private, in an attempt to get ahead of what was coming -- both the positive and negative sides of the advances. Most people either ignored the advances, or quickly identified and connected to the negative side, and effectively filtered out the rest.

As somebody who comes from a politically left family, and was also around in the early days of the Web, let me tentatively note that this issue has a particular political slant, too. The left has strong roots in being able to effectively critique new developments, economic and social, that don't come from its own engines of innovation which have revolved around those critiques, plus solidarity, organization, and sociopolitical action.

The movement's theorists work far more slowly on how to integrate the effect of those changes into its vision. That means when something like this comes along, the left's cultural norms err on the side of critique. Which is fine, but it makes any other expression both hard to convey, and instantly suspect in those communities. I saw this in the early Web, where from a small group of early adopters of all political slants, it was the independents, heterodox leftists, and the right, -- and most vocally, the libertarians -- who were able to most quickly adapt to and adopt the new technology. Academic leftists, and those who were inspired by them took a lot longer to accomodate the Net into their theses (beyond disregarding or rejecting it) and even longer to devise practical uses for it.

It wasn't that long, I should say -- a matter of months or years, and any latent objections were quickly swamped by younger voices who were familiar with the power of the Net; but from my point of view it seriously set back that movement in practicality and popularity during the 80s and 90s.

I see the same with AI: the left has attracted a large generational of support across the world from providing an emotionally resonant and practical alternative to the status quo many people face. But you quickly lose the mandate of heaven if you fail to do more than just simplistically critique or reject a thing that the average person in the world feels they know better, or feels differently toward, than you do. This is something to consider, even if you still strongly believe yourselves to be correct in the critiques.

steveklabnik

5 days ago

Hey, thanks for this. It helps with some stuff I've been thinking about. I appreciate it a lot, gonna do more thinking.

dannyobrien

5 days ago

That's funny! I mulled it a bit more after reading your piece, and I think some of the comments on lobsters that were considering the whole issue in a wider politicL framing. Thanks for replying!

lomase

5 days ago

I never used AI, microservices, blockchain, Mongo, and used only 2 js frameworks in my carrer.

adammarples

5 days ago

What is the current meta on a fully agentic workflow as he describes it? Wake up, write a vague prompt, go for coffee, and get a PR to review on your phone? I'm genuinely interested to see what going all in on this looks like.

jacobsenscott

5 days ago

I've yet to see an LLM produce anything but disastrous bugs. Remember they don't know they're writing code that can destroy companies and livelihoods. They are just spitting out plausible but always wrong text.

apwell23

5 days ago

> “hallucination” is the first thing developers bring up when someone suggests using LLMs, despite it being (more or less) a solved problem.

i don't get it. It keeps making stuff up or gives me wrong stuff that merely compiles when i tell it signature doesn't exist.

rienbdj

5 days ago

I tried Copilot many times and found it lacking. I like discussing concepts (maybe with pseudocode) with Gemini. I find this useful but not game changing.

So genuine question for AI Boosters, what should I be using?

I would like the 10x productivity gains we were promised :)

wan23

5 days ago

Someone at work said something to the effect of "Oh, this would be great for an intern but we don't have one this year" to which I responded "sure you do, its name is Copilot".

panny

6 days ago

Every time I read one of these it feels like I'm reading an AI generated sales pitch for AI.

keepamovin

5 days ago

This is very well written article. I love the style: aggressive, youthful, humorous, informed, experienced.

I think it could do with a little bit more friendly, but… it’s got its own charm.

It also convinced me I should be using agents rather than raw dogging Gemini 2.5 pro

dweinus

5 days ago

> You’ll spend 5-10 minutes knocking it back into your own style

You lost me here. I have often found it to be far more than a 10-minute style issue, but fundamental misunderstanding of the code purposes that I need to fix.

ang_cire

5 days ago

One thing that really bothered me that the author glossed over (perhaps they don't care, given the tone of the article) is where they said:

> Does an intern cost $20/month? Because that’s what Cursor.ai costs.

> Part of being a senior developer is making less-able coders productive, be they fleshly or algebraic.

But do you know what another part of being a senior developer is? Not just making them more productive, but also guiding the junior developers into becoming better, independent, self-tasking, senior coders. And that feedback loop doesn't exist here.

We're robbing ourselves of good future developers, because we aren't even thinking about the fact that the junior devs are actively learning from the small tasks we give them.

Will AI completely replace devs before we all retire? Maybe. Maybe not.

But long before that, the future coders who aren't being hired and trained because a senior dev doesn't understand that the junior devs become senior devs (and that's an important pipeline) and would rather pay $20/month for an LLM, are going to become a major loss/ brain drain domestically.

raddan

5 days ago

I think what is going to happen is that junior devs will develop a strong reliance on AI tools to be able to do anything. I cynically think this was OpenAI’s aim when they made ChatGPT free for students.

I had a rather depressing experience this semester in my office hours with two students who had painted themselves in a corner with code that was clearly generated. They came to me for help, but were incapable of explaining why they had written what was on their screens. I decided to find where they had lost the thread of the class and discovered that they were essentially unable to write a helloworld program. In other words, they lost the thread on day one. Up until this point, both students had nearly perfect homework grades while failing every in-class quiz.

From one perspective I understand the business case for pushing these technologies. But from another perspective, the long term health of the profession, it’s pretty shortsighted. Who knows, in the end maybe this will kill off the group of students who enroll in CS courses “because mom and dad think it’s a good job,” and maybe that will leave me with the group that really wants to be there. In the meantime, I will remind students that there is a difference between programming and computer science and that you really need a strong grasp of the latter to be an effective coder. Especially if you use AI tools.

hyperbovine

5 days ago

> Who knows, in the end maybe this will kill off the group of students who enroll in CS courses “because mom and dad think it’s a good job,”

I see this so much. “Data science major” became the 2020s version of law school. It’s such a double edged sword. It’s led to a huge increase in enrollment and the creation of multiple professional masters programs, so the college loves us. We hire every year and there’s always money for just about anything. On the other hand, class sizes are huge, which is not fun, and worse a large fraction of the students appear to have minimal intrinsic interest in coding or analyzing data. They’re there because it’s where the jobs are. I totally get that, in some sense college has always been that way, but it does make me look back fondly on the days when classes were 1/4 as big and filled with people who were genuinely interested in the subject.

Unfortunately I think I may get my wish. AI is going to eliminate a lot of those jobs and so the future of our field looks a bit bleak. Worse, it’s the very students who are going to become redundant the quickest that are the least willing to learn. I’d be happy to teach them basic analysis and coding skills, but they are dead set on punching everything into ChatGPT.

timdiggerm

5 days ago

> I cynically think this was OpenAI’s aim when they made ChatGPT free for students

Is there any interpretation that makes sense _other_ than this?

leereeves

5 days ago

It's only free for two months. I'm surprised they don't offer a similar free trial for everyone.

asveikau

5 days ago

There are all sorts of stereotypes about frugal students which have truth, but simultaneously, the education market is in some respects a cash cow. Look at tuition costs, or the cost of textbooks. If they're already spending a lot of largely loaned money for education, I guess charging students and having them think it's an education tool gets them some revenue.

zahlman

5 days ago

> Up until this point, both students had nearly perfect homework grades while failing every in-class quiz.

This is nothing new. In a computer graphics class I took over 20 years ago, the median score on the assignments before the midterm was >100% (thanks to bonus questions), yet in midterm prep other students in the class were demonstrating that they didn't even have a firm grasp on the basic concept of a matrix.

That is: they were in a 4th year undergrad course, while doubting material from a senior year high school course where they had to have gotten high marks in order to get into the program.

And the midterm grading was heavily curved as a result (though not as much as in some other courses I took).

Students will do what they need to do for the grade. It seems a great many of them have internalized that none of this is about actually learning anything, even if they would never say so aloud. (I learned things - where I didn't already know them - because it was actually interesting. My resulting grades were pretty good overall, but certainly not top of class.)

> Who knows, in the end maybe this will kill off the group of students who enroll in CS courses “because mom and dad think it’s a good job,”

Why would it? It's becoming easier than ever to fake understanding, and to choose anything else they would need both the opportunity and social permission. I only see the problem getting worse.

paulmist

5 days ago

> Up until this point, both students had nearly perfect homework grades while failing every in-class quiz.

From a student's perspective: I think it was the same with SO. While LLMs make c&p even easier, they also have the upside of lowering the bar on more complex topics/projects. Nowadays, the average person doesn't touch assembly, but we still had a course where we used it and learned its principles. Software engineering courses will follow suit.

Izkata

5 days ago

StackOverflow users at least tried to fight against it. The disdain for anything that looked like "homework questions" was one of the reasons it got a bad reputation among some people.

baq

5 days ago

Yeah they should stop being so hard about these, it isn't like anyone is coming for homework to SO anymore. It's all LLMs and they do not refuse.

zahlman

5 days ago

The problem isn't that the question is "for homework".

The problem is the lack of analysis that goes into producing a useful question for others that fits in with the rest of the site.

True, proper analysis of the homework rarely yields such questions, and even less so in 2025. But the point was always to have a question that's about a clear, specific problem, not about a task that was assigned. Because the latter can only possibly help people who were assigned the same task.

code_for_monkey

5 days ago

they can be a tad overzealous!

jajko

5 days ago

Hard capitalism doesn't care about long term perspectives, the only yard stick is current performance and stock maximization. Otherwise US would a bastion of stellar public education for example, the investment in long term future of whole nation instead of few richest ones sending their kids to private schools, to stay above the rest.

So while I fully agree with you, this is not a concern for a single decision maker in private company world. And state such as US doesn't pick up this work instead, quietly agreeing with this situation.

Well, think for a second who makes similar budget and long term spending focus. Rich lawyers who chose to become much more rich politicians, rarely somebody else and almost never any more moral profession.

esskay

5 days ago

It's a bit misleading to compare $20/month with an actual human person. The junior dev wont get half way through the day and tell you they've used up all their coding time for the month and will now respond with jibberish.

Cursor is a heck of a lot more than $20/month if you actually want it working for a full work day, every day.

falcor84

5 days ago

> The junior dev wont get half way through the day and tell you they've used up all their coding time for the month and will now respond with jibberish.

This issue manifests a bit differently in people, but I've definitely worked with people (not only juniors) who only have a few productive hours a month in them. And for what it's worth, some of those people were sufficiently productive in those few hours that it was rational for the company to keep them.

patrickmay

5 days ago

I worked on a team where a new hire from a prestigious school told his manager "That work is boring and mundane and I'm not going to do it."

He didn't last long.

esskay

5 days ago

Yeah, so you fire them and replace them with another human, thats still vastly cheaper than a person plus a per-token ai fee.

norir

5 days ago

And it wouldn't surprise me if he is now the boss of people like his prior boss.

4hg4ufxhy

5 days ago

My coworkers are burning 10k/month on cursor.

kfajdsl

5 days ago

how?? I consider myself pretty heavy handed with letting Gemini fill up its 1M token context window for individual tasks, and I don't exceed $20 per day. Do they just let the agent spin in a loop for 4 hours?

Closi

5 days ago

Maybe - but it will still probably be less than a junior dev.

You could probably hammer the most expensive cursor API all-day every-day and it would still be a fraction of the cost of a junior dev.

xmodem

5 days ago

Further, Cursor might cost $20/month today, but to what degree is that subsidized by VC investment? All the information we have points to frontier models just not being profitable to run at those types of prices, and those investors are going to want a return at some point.

modzu

5 days ago

the market will indeed balance this out. remember when a taxi was $20 and an uber $5? now an uber is $25. nobody is going to go back to humans with all their wet meat sack problems, we will get more value for it, but it aint gona stay $5 if those putting up all this capital have anything to do with it. then again, we might get cheap, self hostable local copies (unless theyre made illegal for "safety" or some bullshit)

captainbland

5 days ago

I think the most likely thing is the cheap self hostable copies will broadly stop improving significantly. It'll be too costly for a community project to distill a bleeding edge cloud model and companies will stop releasing them. What's free now will remain free, we might even get another gen or 2 of improvements (possibly with diminishing returns) on free/cheap local models but those days are numbered.

awkward

5 days ago

What's the force that you see keeping those days numbered? Many ollama models are runnable on a local laptop, and models like deepseek even more so. Aside from open source interest, there are many large state and corporate actors who would like to see models commoditized.

captainbland

4 days ago

The only real wild card to me is state sponsorship. Everywhere else I expect such a capital intensive tech industry to approach winner (or maybe duopoly) takes all over time.

My view is that right now, because of the willingness of corporations and other investors to swallow short term (but massive) losses on this, we're basically in AI fiscal fantasy land.

The question we should be asking is how do we get access to these local models in the first place? It's all based on the work of these hyper expensive base models as the best small models are quantisations and distills. None of this works as soon as the profit motive comes into play and companies start gatekeeping effectively, which they will.

jes5199

5 days ago

sounds like uber is ripe for disruption by somebody who doesn’t need the accounts to balance yet

wincy

5 days ago

I dunno, with the advances in open source models I could see in a few years having AI workstations that cost $20,000 with 1TB of VRAM so you don’t have to rely on OpenAI or Cursor. The RTX 6000 Pro is only $7500 and has 96GB of VRAM.

elric

5 days ago

This is something that's been simmering in the back of my mind for a while. Using an AI Agent instead of talking to your human colleagues deprives both of you from learning opportunities. There are probably short term gains in many cases, but I fear there will be long term losses over time.

causal

5 days ago

I agree, and think that organizations that figure out how to use AI well in a collaborative way will succeed in the long-term. Developer community is still where the important growth happens.

throw8393494

5 days ago

Is it possible to talk to coworkers? What if you voted for wrong party? Are closeted gay/trans/qeer? Radical femimist?! Or dog atrack survivor, and they really _REALLY_ like dogs!

Talking to colleagues at work is a chore, and huge risk! Not opportunity! At least AI respects my privacy, and will not get my fired!

empath75

5 days ago

> But do you know what another part of being a senior developer is? Not just making them more productive, but also guiding the junior developers into becoming better, independent, self-tasking, senior coders. And that feedback loop doesn't exist here.

Almost every senior developer I know is spending that time making LLM's more productive and useful instead.

Whatever you think the job is of the senior developer, it will not be "coding".

I think people need to stop thinking of themselves as computer programmers and start thinking of themselves as _engineers_. Your job isn't writing programs, your job is _using the technology you have available to solve problems_. Maybe that is through writing code, but maybe it's orchestrating LLM's to write code for you. The important part is solving the problem.

ang_cire

3 days ago

> Almost every senior developer I know is spending that time making LLM's more productive and useful instead.

LLMs may become more productive/ accurate/ useful, but they're not self-tasking or independent.

> I think people need to stop thinking of themselves as computer programmers and start thinking of themselves as _engineers_. Your job isn't writing programs, your job is _using the technology you have available to solve problems_.

There is a progression of skill required to master any profession, starting with fundamentals, and progressing and developing until you are an expert/ senior at that profession. How is a senior sw dev supposed to become that without writing code? Just reading LLM code and bugfixing isn't the same level or kind of experience. You're going to have devs who can't code by themselves, and that's a bad place to be in.

There are already too many people in IT using tools that they don't understand the workings of (and thus can't troubleshoot, can't replace, can't customize to their env, etc), and this will just exacerbate that x100.

MMW there is going to be a very bad skill deficit in IT in 20 years, which is going to cause an innovation deficit.

ignoramous

5 days ago

How many senior developers understand the minute, intimate details of the frameworks, libraries, languages they use? How many understand the databases they use? TFA says, many (but not all) don't have to care as long as the product ships. That's exactly how code written by LLMs is meant to be tested and evaluated. And if you set up a good enough build/test environment, TFA argues that you can automate most of the schelp away.

deltaburnt

5 days ago

In my experience, senior engineers without the curiosity to occasionally dig deeper into their frameworks are significantly worse engineers. No framework, library, or language is perfect. A big part of your job is understanding how your code will execute. Sometimes that even requires, at a very high level, imagining how that code will compile down to assembly once you strip away all the abstractions.

Eventually you will get a memory leak even in a GCd language. Eventually there will be some incredibly obscure, unreported bug in a library. Eventually you will find an issue in unmaintained code you depend on. Eventually there will be performance problems caused by too many layers of abstraction.

You either need to know, roughly, how your dependencies work by occasionally digging into their code/reading the documentation. Or you need intuition to know how it probably works, but you usually build that intuition by actually writing/reading code.

kfajdsl

5 days ago

The people who want to look under the hood even if they have no immediate reason to do so will always exist, and the people who don't care and just learn the bare minimum to produce app features that usually work will always exist. All LLMs do is decrease the skill requirement to be developer B, but they also make it easier to learn what you need to be developer A.

deltaburnt

4 days ago

I think we will end up with significantly more developer B and significantly less developer A. I can't see the future, I don't know how that would play out long term, maybe it will all even out. But I think in the medium term it will be a lot of headaches for someone like me who spends a significant amount of time reviewing code and steering architecture.

Also, I spend a lot of time mentoring, and I'd like to think A will grow to be more like B over time. But now it feels like it's wasted effort to try mentoring those skills if it won't be valued.

kfajdsl

8 hours ago

I definitely think that there will be a far higher % of B than A in the future, but I also think it's accurate to say that the ratio of A:B was much higher in the 70s than now, even though there's a lot more A in absolute terms today. That's just what happens when you lower the barrier to entry.

There's definitely a cohort of on average much lower quality CS graduates though, between COVID and universities not knowing how to deal with AI (I say this as part of that cohort).

samyar

5 days ago

Well i hope that happens, you go apply for a job already 10k applied. What kind of job market is this.

ReptileMan

5 days ago

>We're robbing ourselves of good future developers

You call it robbing ourselves of good future developers, I call it hourly consultancy rate increase.

paffdragon

5 days ago

I imagine it like this. Juniors will be taught by LLMs on some things, but seniors will still be there, they will still assist, pair program, code review, etc. but they will have another party, the LLM, like a smarter calculator.

bitwize

5 days ago

Eric S. Raymond is now doing 100% of his coding with AI assistance, and claims he is much faster and more productive.

I think we've passed the threshold beyond which not using AI is just doing things the hard way for no appreciable gain.

layoric

5 days ago

> Some of the smartest people I know share a bone-deep belief that AI is a fad — the next iteration of NFT mania.

> The median dev thinks Star Wars and Daft Punk are a public commons. The great cultural project of developers has been opposing any protection that might inconvenience a monetizable media-sharing site. When they fail at policy, they route around it with coercion. They stand up global-scale piracy networks and sneer at anybody who so much as tries to preserve a new-release window for a TV show.

So it starts with a humble strawman, and then the author illustrates how dumb they think their fellow developers really are if they have the make different decisions based on different values.

Most developers I interact with who are what you might call 'AI skeptic', are most concerned with how this tech will be used, especially since its creation is rooted in mass scale disregard for ownership of anything.

> "The great cultural project of developers has been opposing any protection that might inconvenience a monetizable media-sharing site.... They stand up global-scale piracy networks",

IMO the author is here projecting or something, cause I literally never met someone with this opinion, and I've also been in tech/developer circles for over 20 years. Personal use of pirated content is very different from commercial distribution or making money of share sites. Not everyone's values are so rooted in making money with complete disregard to the impact of their actions in doing so.

I get it, the author wants to keep using LLMs are for people to stop trying to make them feel bad but trying to make a case for how their arguments are just dumb. But the author completely missed the 'why' behind the arguments. In the end, LLMs are a tool. Use them or not is up to the individual. But that doesn't give a blanket social license to use them in any way people, or more importantly, companies want.

mrbluecoat

5 days ago

https://www.nytimes.com/2025/06/02/opinion/ai-coders-jobs.ht...

snickerbockers

6 days ago

>People complain about LLM-generated code being “probabilistic”. No it isn’t. It’s code. It’s not Yacc output. It’s knowable

okay wtf is this weird jab at yacc supposed to be? I generally prefer to write my own LR parsers because it's really not that difficult but I never felt like yacc code was difficult to read or not "probabilistic".

AFAIK most serious compilers don't use yacc either because they need to be able to parse context-dependent grammars, and also reinventing wheels is often worth the effort when the wheel is 50 years old and forces you to make compromises because it also has to support use-cases you are not interested in. But again, it's not bad, it does the thing it is designed to do and it does save a lot of effort if you're fine dealing with its limitations.

Only serious complaint I have about Yacc is that the basic posix form has frustrating limitations such as not being able to have two separate yacc instances because they'll declare the same symbols, and I'm fairly certain it's not thread-safe either. Bison fixes some of these while still remaining "probabilistic".

Anyways I'd rather use Yacc+Lex (preferably Bison+flex) to generate my parser than have to go read everything your "agent" wrote with a fine-toothed comb to make sure it didn't introduce any crippling memory corruptions or dangerous ROPportunities.

bhupesh

5 days ago

I am in the middle, neither a dogmatic skeptic nor a full-blown prompt engineer, but I lost it when the author compared a junior developer (a human) to a SaaS subscription.

Tells you what you need to know about the AI culture.

speedgoose

5 days ago

I don’t think it’s an uncommon comparison.

If you don’t consider the human aspect, a LLM coding agent has a similar use than a junior developer.

Of course, humans have advantages and those advantages have value. But the coding agent costs $20 dollars a month.

Replacing humans my machines that are much cheaper isn’t really something new.

karel-3d

5 days ago

What is the agent setup that I should try?

What I tried is indeed "set up Cursor" or "copy-paste ChatGPT", and it was all bad, I have no idea what is this "agent" and "MCP" stupp.

jleyank

6 days ago

If they’re regurgitating what’s been learned, is there a risk of copyright/IP issues from whomever had the code used for training? Last time I checked, there’s a whole lotta lawyers in the us who’d like the business.

simonw

6 days ago

Most of the big AI vendors have liability policies in place where their lawyers are committed to step in and help you out if someone sues you for copyright infringement based on code their LLMs wrote for you. I think the term they use is "copyright shield".

OpenAI has that as point 14 in https://openai.com/policies/services-agreement/

Anthropic mention theirs here: https://www.anthropic.com/news/expanded-legal-protections-ap...

Here's the Google post: https://cloud.google.com/blog/products/ai-machine-learning/p...

mudkipdev

5 days ago

Which AI agent is actually good nowadays? I've tried cursor but i didn't want a fork replacing vscode just to be able to call an LLM. And I'd preferably want pay as you go, not a subscription model

CraigJPerry

6 days ago

That was a stellar read. I've had (at least parts of) many of these thoughts floating around my head over the past few weeks / months, but it'd have been beyond my ken to write them down as lucidly.

pianopatrick

5 days ago

Seems to me that AI itself will take some jobs but the cheap workable software the AI creates will take even more jobs, especially if the AI can create workable software for automation robots.

metalrain

5 days ago

I think LLMs are great and patient teachers.

But code is something I like to write myself, it's not about speed of producing code. It's keeping it such you still fully understand it.

FilosofumRex

5 days ago

Don't be pennywise and pound foolish...

The LLM coding agents might help you fire some swe to save a penny or two, but soon enough, it will also help your customers or suppliers to replace your business model.

ReflectedImage

5 days ago

Whilst generative AI is great for writing business letters and essays and all that, going straight for coding, one of, if not the hardest problem is a bit much.

This is a come back in 5 years when it works sort of thing.

1a527dd5

6 days ago

I'm a recent convert. I _was_ a sceptic until I wasn't (https://news.ycombinator.com/item?id=44059847).

stavros

5 days ago

This isn't about the post itself, but can you please put the author's name at the top? It's written in first person and we have no idea who's talking until the end of the post.

mwcampbell

5 days ago

To anyone else who was going to protest that the name _is_ at the top, apparently it depends on one's browser window width and zoom level. I was a bit surprised to find that increasing the zoom level even affects the order when reading with a screen reader.

thomasfl

5 days ago

Sometimes AI model generates technical debt. It happens for instance if AI generates code with dependancies that to old to be automatically be updated to the latest version.

Existenceblinks

5 days ago

Ironically all nuts friends seem to be talking less, less defensive than other nuts friends who keep talking about other people not buying what they believe nuts.

ttiurani

5 days ago

"You don’t believe in IPR. Then shut the fuck up about IPR."

As a software engineer, I need to call out this rhetoric. We are not all morally bankrupt. Many of the people with the most ethical backbone I know are also software engineers.

I don't use "AI" for the same reason I don't use amphetamine. I'm sure I could get stuff more stuff done faster if I used it, but I don't, because the manufacturing and the distribution of it is completely unethical.

The blatant stealing of intellectual property is only the tip of the shitberg. As bad are the giant ecological footprint[1], the colonialism of the grossly underpaid RLHF, and the conceding of even more of our digital lives to trillion dollar corporations.

We can and must ethically evaluate our work, and that includes our tools. Nobody's perfect, but doing one immoral thing does not mean you get a free pass to skip morality altogether.

1. https://www.greenpeace.de/publikationen/environmental-impact...

msteffen

5 days ago

This article is ridiculous

Q: but the code that AI generates is incoherent gibberish

Guess what, life’s tough. Fixing incoherent gibberish is part of the job you wuss. Honestly, I bet you write incoherent gibberish. Would you rather be fixing incoherent gibberish that an AI wrote or incoherent gibberish that you wrote?

Q: but I have to re-do everything it touches. Everything takes me ten times as long!

What are you, some kind of vibe-coding YouTuber? Everything taking ten times as long is part of the job, cupcakes. What’s important is that now it takes ten times as long automatically. I like the way Zed does it, where it tells you to leave while it’s changing stuff.

kurko

5 days ago

What's the likelihood that this was 90%+ written by an LLM? It'd be ironic, especially now that there are 2000+ engineers commenting on it.

johannesrexx

3 days ago

OP opened with an ad hominem argument that his skeptic friends are all nuts.

OP therefore loses the argument.

homeonthemtn

5 days ago

AI is a tool, akin to a seamstress' sewing machine. Use it well, and with intention, and it'll help speed up your work. That's it.

Ygg2

5 days ago

> but it’s bad at rust

It's not just Rust. It's everything that has been created post the last vacuum. You need a large corpus for LLM to get decent-ish.

deadbabe

6 days ago

I can’t wait for the day when people no longer manually write text messages to each other, but instead just ask LLMs to read and respond from a few prompted words.

Mofpofjis

5 days ago

next level: use AI agents in their dating apps on both sides to decide if they want to hook up

deadbabe

5 days ago

Just send AI agents on a simulated date and see how it turns out.

keeganpoppen

6 days ago

this is one of those fascinating cases where i agree with none of the arguments, but vehemently agree with the conclusion… it ordinarily would give me pause, but in this case i am reminded that nonsense arguments are equally applicable to both sides of the debate. if the arguments actually had logical connection to the conclusion, and i disliked the arguments but liked the conclusion, that would be real cause for introspection.

Yossarrian22

6 days ago

Where is the counter argument to this not being sustainable?

guywithahat

6 days ago

I develop space-borne systems, so I can't use the best LLM's for ITAR/etc reasons, but this article really makes me feel like I'm missing out. This line in particular makes me wonder if my skills are becoming obsolete for general private industry:

Every once in a while I see someone on X posting how they have 10 agents running at once building their code base, and I wonder if in 3 years most private industry coders will just be attending meetings to discuss what their agents have been working on, while people working on DoD contracts will be typing things into vim like a fool

ghc

6 days ago

> while people working on DoD contracts will be typing things into vim like a fool

Forget LLMs, try getting Pandas approved. Heck I was told by some AF engineers they were banned from opening Chrome Dev Tools by their security office.

FWIW I think the LLM situation is changing quite fast and they're appearing in some of our contracts. Azure-provided ones, of course.

fellowniusmonk

6 days ago

Frankly, as someone who is engaged in fields where LLMs can be used heavily.

I would stay in any high danger/high precision/high regulation role.

The speed at which LLM stuff is progressing is insane, what is cutting edge today wasn't available 6 months ago.

Keep up as a side hobby if you wish, I would definitely recommend that, but I just have to imagine that in 2 years a turnkey github project will get you pretty much all the way there.

Idk, that's my feeling fwiw.

I love LLMs but I'm much less confident that people and regulation will keep up with this new world in a way that benefits the very people who created the content that LLMs are built on.

threeseed

6 days ago

> The speed at which LLM stuff is progressing is insane

You clearly haven't been following the space or maybe following too much.

Because the progress has been pretty slow over the last years.

Yes modals are cheaper and faster but they aren't substantially better.

fellowniusmonk

6 days ago

Over the last years? As in two years or more? Could you explain that a bit more?

I consider "LLM stuff" to be all inclusive of the eco-system of "coding with LLMs" in the current threads context, not specific models.

Would you still say, now that the definition has been clarified, that there has been slow progress in the last 2+ years?

I am also curious if you could clarify where we would need to be today for you to consider it "fast progress"? Maybe there is a generational gap between us in defining fast vs slow progress?

3abiton

5 days ago

> For art, music, and writing? I got nothing. I’m inclined to believe the skeptics in those fields. I just don’t believe them about mine.

I am curious about this part though

paulsutter

6 days ago

The best I can offer skeptics is the more you work with the tools the more productive you become. Because yes the tools are imperfect.

If you've had a dog you know that "dog training" classes are actually owner training.

Same with AI tools. I see big gains for people who spend time to train themselves to work within the limitations. When the next generation of tools come out they can adapt quickly.

If this sounds tedious, thats becuase it is tedious. I spent many long weekends wrestling with tools silently wrecking my entire codebase, etc. And that's what I had to do to get the productivity improvements I have now.

trostaft

5 days ago

Thanks for the article. I've been unwilling to try agentic editing for a while, and this is the kick I need to give it a shot.

Bluescreenbuddy

5 days ago

>but the code is shitty, like that of a junior developer >Does an intern cost $20/month? Because that’s what Cursor.ai costs.

RIP Junior devs

hlpn

5 days ago

Very weak arguments and intentionally provocative.

ivolimmen

5 days ago

Call me old but I am not using AI for coding. It's not what I expect or want an AI to do for me. I would love for it to do stuff I don't like to do. It's like the meme that was floating around: that a woman says it would have loved the AI to do the dishes so she can to art; not the other way around. Yes I am also a skeptic but that is not weird as I have seen to many movies about AI taking over the world. I hope I never have to say: "I told you so". I really do see positive points to AI but just not in coding. I would not even trust them to write tests. Might make them write documentation at some point.

I would like to point out: LLM's look really intelligent until you converse with them in a less popular language, it will make weird mistakes and interpret your questions all wrong. I am Dutch and they really suck at that.

rk06

5 days ago

Can someone guide me which AI tools are good? And their docs and tutorial

I want to see the hype but without paying up front. My focus area is c#, .net, js.

wiseowise

5 days ago

You’re the goat, Ptacek.

Seriously, this is the most based defense of LLM that I’ve read over the years. All the yapping in the thread only proves your points.

fourthark

5 days ago

LLM, please find all occurrences of "shut the fuck up" and suggest actual arguments for the author to use.

Otherwise, well written and persuasive.

storus

6 days ago

Soon all coding will look like L3 support - debugging something you've never seen before, and under pressure. AI is really taking away the fun parts from everything and leaving just the drudgery in place.

FridgeSeal

6 days ago

“What do you mean you want to think about our architecture? Just get the LLM to do it, and we’ll get it to fix it if anything goes wrong”

“No we’re not allocating any time to thinking about the design, just get the LLM to do it”

I’m so excited for the bleak future.

sanderjd

6 days ago

The key is to figure out how to move up the ladder of abstraction. You don't want to be a "coder" in a world where AI can code, but you do want to be a "person who makes software" in a world where making software just got easier.

layer8

6 days ago

Most people who chose the profession don’t want that, though. They like the coding, and dislike managing.

sanderjd

6 days ago

How does "person who makes software" imply "managing"?

I understand that "coding" is the fun part for lots of people, especially younger people. This is me as well, so I'm definitely sympathetic to it, and feel quite a bit of sadness about this.

Lots of people also enjoy woodworking and machining by hand, but that's not how most furniture or machines are made.

If I were independently wealthy, I might well spend some of my time making artisan software, but as a professional and entrepreneur, I'm going to try to use efficient tools for the job.

layer8

6 days ago

This relates to the analogy that an LLM is like a junior developer that you have to instruct and guide, and whose work you have supervise and review. Working with an LLM is similar to managing people as a tech lead. And once LLM agents get smart enough and reliable enough, the work will be similar to that of a product manager, project leader, CTO, or even CEO.

If you like being an entrepreneur, you’re already different from most professional software developers.

sanderjd

5 days ago

I'm aware of that analogy, and thought that might be what you were alluding to, but I don't think it's a good analogy.

I agree with you that most professional software developers don't like being entrepreneurs, but I think that has more to do with disliking the parts of entrepreneurship that don't fit into "person who makes software", like fundraising and marketing.

But I think many - maybe most, but not all - professional software engineers actually do enjoy "making software", generally, and not just "coding", narrowly.

layer8

5 days ago

I agree that they enjoy "making software", but typically not if it involves a lot of instructing LLMs with natural language and reviewing their output. Again, this will become very similar to the job of a project or product manager who doesn't do any coding themselves. Developers tend to not like that kind of role.

sanderjd

4 days ago

Maybe! But I think this is very debatable, or at least, impermanent.

To me, it is very reminiscent of when I was a wee lad and there was a prevalent view that no real developer would want to make web applications, that was for script kiddies. But it turned out to be a useful kind of software to build, so a lot of people who build software started building web applications.

I also think that lots of developers do naturally enjoy increasingly higher levels of work than "just" writing code, as they progress in their careers. The job of "staff+ engineers" has significant overlap with what product managers do, where the "product" they're thinking about is the technical design of a larger system. Lots of developers really do enjoy this kind of work!

threeseed

6 days ago

People said the same about VB style coding then low-code and now AI.

They have been wrong every time and will continue to be wrong.

storus

6 days ago

This feels different; I asked DeepSeek R1 to give me an autoregressive image generation code in pytorch and it did a marvelous job. Similar for making a pytorch model for a talking lip-synced face; those two would take me weeks to do, AI did it in a few minutes.

Autoregressive LLMs still have some major issues like over-dependency on the first few generated tokens and the problems with commutative reasoning due to one-sided masked attention but those issues are slowly getting fixed.

threeseed

6 days ago

People used to tell me all the amazing things no-code and low-code was able to do as well.

And at the end of the day they went nowhere. Because (a) they will never be perfect for every use and (b) they abstract you from understanding the problem and solution. So often it will be easier to just write the code from scratch.

sanderjd

6 days ago

No-code and low-code tools have been very successful...

tilne

5 days ago

They fill a need, but I wouldn’t say they e been “replace people’s jobs” successful, as this context hints at.

sanderjd

5 days ago

They have definitely replaced jobs! There are tons and tons of sites and applications now that have been built with these tools and otherwise would have been built by software developers using wordpress or drupal or rails or whatever.

But that doesn't mean the developers who would have otherwise done that work were just disemployed by the success of these tools. No, they just worked on different things.

And I think that is a valuable lesson that can be applied (though I think not perfectly) to this LLM era.

slowhadoken

5 days ago

There is no AI yet, it’s just automation and that’s why people hate it. It also allows corporations to consume open source code for profit.

skeptrune

5 days ago

> I think this is going to get clearer over the next year. The cool kid haughtiness about “stochastic parrots” and “vibe coding” can’t survive much more contact with reality. I’m snarking about these people, but I meant what I said: they’re smarter than me. And when they get over this affectation, they’re going to make coding agents profoundly more effective than they are today.

Yes, yes, and yes. SOOOO much yes. We are going to be in for a wild ride once everyone gets on board with the reality of coding agents.

evantbyrne

5 days ago

> Does an intern cost $20/month? Because that’s what Cursor.ai costs.

The cost is a pretty obvious indication that it's nowhere near the capability of an engineer. A robotic worker that can produce code 24/7 without tiring would be worth even more than a person of equivalent skill. I don't know what kinds of goofy hiring ya'll are doing, but all of the juniors I've been involved in hiring actually worked out great.

chairmansteve

5 days ago

I find it interesting that TFA claims that Go is the most LLM legible language.

I suspect C# Asp.Net Core not so much. Any thoughts?

isatty

5 days ago

I don’t use AI and I don’t really care for it, but the language in this article is really bad.

I expected better from Thomas.

bhaktatejas922

5 days ago

I find the claims of 2-5x more productivity insane. I'm at least 10x more productive.

nsagent

5 days ago

I was willing to read until I got to this part. He can shove this drivel up his own ass for all I care if that's how he thinks. What a waste of time.

mirkodrummer

6 days ago

> but the code is shitty, like that of a junior developer > Does an intern cost $20/month? > Because that’s what Cursor.ai costs.

So sick of hearing comparisons between LLMs and Juniors, first of all are we really are seniors? After almost 20 years sometimes I still think I know nothing. Second, when I was a Junior just me being younger, fresher and with an eager learner attidute brought a net positive impact on the team I joined back then, that I still hang out with those people to this day and we still have fun joining new projects together. I'm so sick of these snobbish "seniors" that keep throwing rocks at younger and well prepared people

lucaspauker

5 days ago

> Also: let’s stop kidding ourselves about how good our human first cuts really are.

dburkland

5 days ago

Really enjoyed this article and how it was written, especially the “tookr-jerbs” part.

voidUpdate

5 days ago

but I like writing code, rather than reviewing code somebody/thing else wrote

eqmvii

5 days ago

Coming around to this conclusion myself after experimenting with the tools for a few weeks.

Things have changed.

shalmanese

5 days ago

I mean, the root cause of this is quite mundane.

People like to think that intelligence is some intrinsic force but it's far better thought of as situational. We are contextually smarter or dumber based on the emotions we're trying to process from a situation and how our motivation reasoning allows us to avoid certain negative emotions.

The subset of people the OP is trying to argue against are people who are otherwise normally intelligent but have an emotional need for AI to be unimportant that they're unable to process so they build a social community of terrible arguments.

The nature of the internet is that visibility is terribly correlated (and often anti-correlated) with popularity so OP thinks this is some large contingent of people when it's instead algorithmically served conflict generators.

Writing against that is also an algorithmic conflict generator, as evidenced by the number of upvotes here. The correct response is to realize that people putting up terrible arguments about AI don't actually care about AI, they're using this issue to process unresolved trauma from other, unrelated areas of their life and taking their ideas literally instead of seriously is a waste and diversion. There are plenty of smart people with far more nuanced views that still disagree with you where they're interested in the content of the argument.

hakanderyal

5 days ago

Spot on. Current gen LLMs also has the ability to mimic your code style. I always point it to existing code, and the result is the same as what I would have written.

Yesterday I “wrote” 18k lines of code with Claude & Cursor. Mostly UI, some backend. This would have taken me a week without LLMs.

I’m sold.

JodieBenitez

5 days ago

I don't get the focus on the coding. In all the projects I worked on for 25 years, coding has never been the bottleneck in projects. Maybe the projects I work on are simple, maybe I was lucky, but still: once the requirements are clear, coding is easy. And it's literally what takes the least time. I don't want to spend less time on coding thanks to AI. Because coding is what I like, not the rest of the job. Bring me AI that will do the meetings for me please.

sebstefan

5 days ago

>Professional software developers are in the business of solving practical problems for people with code. We are not, in our day jobs, artisans. Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed. If anything we build endures, it won’t be because the codebase was beautiful.

Maybe you are, but I didn't get a Master's degree to produce quick shit for shareholders

_moof

5 days ago

*sighs in aerospace*

Let me know when I can use one of these without violating ITAR/EAR.

haiku2077

5 days ago

You can, if you run the models locally yourself on your own hardware, e.g. Ollama on a Mac Mini.

ookblah

5 days ago

basically nails every point i've wanted to say but couldn't lol. also in the end the endless posturing gets tired and we go back to getting shit done. either it helps you are it doesn't.

aussieguy1234

5 days ago

I have a trick to improve code quality from LLMs. I only use the ones where I have control of the system prompt.

I use a prompt that convinces the LLM they are a Staff Engineer at Google and care deeply about best practices and writing clean, quality code.

While not perfect, it does improve the code quality.

philosophty

6 days ago

"they’re smarter than me" feels like false humility and an attempt to make the medicine go down better.

1. Thomas is obviously very smart.

2. To be what we think of as "smart" is to be in touch with reality, which includes testing AI systems for yourself and recognizing their incredible power.

mrkurt

6 days ago

It's not false. He's talking about people smarter than him (at writing and shipping infrastructure code).

Thomas is the smartest at other things.

philosophty

6 days ago

It is false and you're proving it. Smarter means smarter.

Smarter does not mean "better at writing and shipping infrastructure code."

Some of the smartest people I know are also infra engineers and none of them are AI skeptics in 2025.

thousand_nights

6 days ago

i feel like surprisingly, front end work which used to be viewed by programmers as "easier" is now more difficult of the two, because it's where LLMs suck the most

you get a link to a figma design and you have to use your eyes and common sense to cobble together tailwind classes, ensure responsiveness, accessibility, try out your components to make sure they're not janky, test out on a physical mobile device, align margins, padding, truncation, wrapping, async loading states, blah blah you get it

LLMs still suck at all that stuff that requires a lot of visual feedback, after all, you're making an interface for humans to use, and you're a human

in contrast, when i'm working on a backend ticket ai feels so much more straightforward and useful

simonw

6 days ago

Programmers who think front end is "easier" than backend have been wrong for well over a decade. https://simonwillison.net/2012/Feb/13/why-are-front-end/

calebm

5 days ago

This article is very well-written. Obviously not written by AI.

TrackerFF

6 days ago

My take is: It is OK to don't buy into the hype. There's a lot of hype, no denying that.

But if you're actively avoiding everything related to it, you might find yourself in a position where you're suddenly being left in the dust. Maybe not now, not next month, not next year, but who some time in the future. The models really are improving fast!

I've talked with devs that (claim they) haven't touched a model since ChatGPT was released - because it didn't live up to their expectations, and they just concluded it was a big nothingburger.

Even though I don't follow the development religiously anymore, I do try to get acquainted with new releases every 3 months or so.

I hate the term "vibe coding", but I personally know non-tech people that have vibe coded products / apps, shipped them, and make more money in sales than what most "legit" coders are making. These would be the same "idea people" that previously were looking for a coder to do all the heavy lifting. Something is changing, that's for sure.

So, yeah, don't sleepwalk through it.

FridgeSeal

6 days ago

The counter-argument as I see it is that going from “not using LLM tooling” to “just as competent with LLM tooling” is…maybe a day? And lessening and the tools evolve.

It’s not like “becoming skilled and knowledgeable in a language” which took time. Even if you’re theoretically being left behind, you can be back at the front of the pack again in a day or so. So why bother investing more than a little bit every few months?

simonw

6 days ago

> The counter-argument as I see it is that going from “not using LLM tooling” to “just as competent with LLM tooling” is…maybe a day? And lessening and the tools evolve.

Very much disagree with that. Getting productive and competent with LLM tooling takes months. I've been deeply invested in this world for a couple of years now and I still feel like I'm only scraping the surface of what's possible with these tools.

FridgeSeal

5 days ago

Does it take months _now_ or did it take months, months/a year ago?

I’m still not entirely sure why it’s supposed to take months. I usually retry every few weeks whenever a new model comes out, and they get marginally better at something, but using them isn’t a massive shift? Maybe I’m missing something? I have some code, I pop open the pane, ask it, accept/reject the code and go on. What else is everyone even doing?

Edit: I’ve seen the prompt configs people at work have been throwing around, and I’m pretty glad I don’t bother with cursor-and-friends when I see that. Some people get LLM’s to write git commits? Lazygit has made most of my git workflow friction disappear and the 1 minute it takes me to write commits and pr’s is less effort than having to police a novel writing incorrect ones.

adpirz

5 days ago

Plug for Simon's (very well written) longer form article about this topic: https://simonwillison.net/2025/Mar/11/using-llms-for-code/

stock_toaster

6 days ago

I think the more "general" (and competent) AI gets, the less being an early adopter _should_ matter. In fact, early adopters would in theory have to suffer through more hallucinations and poor output than late adopters.

Here, the early bird gets the worm with 9 fingered hands, the late bird just gets the worm.

abdullin

6 days ago

It takes deliberate practice to learn how to work with a new tool.

I believe that AI+Coding is no different from this perspective. It usually takes senior engineers a few weeks just to start building an intuition of what is possible and what should be avoided. A few weeks more to adjust the mindset and properly integrate suitable tools into the workflow.

breuleux

6 days ago

In theory, but how long is that intuition going to remain valid as new models arrive? What if you develop a solid workflow to work around some limitations you've identified, only to realize months late that these limitations don't exist anymore and your workflow is suboptimal? AI is a new tool, but it's a very unstable one at the moment.

abdullin

6 days ago

I'd say that the core principles stayed the same for more than a year by now.

What is changing - constraints are relaxing, making things easier than they were before. E.g. where you needed a complex RAG to accomplish some task, now Gemini Pro 2.5 can just swallow 200k-500k of cacheable tokens in prompt and get the job done with a similar or better accuracy.

breuleux

6 days ago

Sure, the core principles are mostly the same, but the point is that it is getting easier and easier to extract value from these models, which means that the learning curve is getting flatter. The main difficulty for now and for the foreseeable future is to get the models to do what we mean, but DWIM is the trend line, it's the objective everyone's trying to get at. Even if we don't quite reach, we'll get closer. And AI that does what you mean is the ultimate tool: it doesn't require any expertise at all. There is no first mover advantage (save for a hypothetical singularity, perhaps).

thesz

5 days ago

I thought so about strong static typing looong time ago.

system2

5 days ago

I am still using ChatGPT like a caveman and am happy with it.

v3xro

5 days ago

Another one, here we go :) I think I'm going to keep it short because I am not here to convince people they're wrong, just to voice my dissent from uh almost every viewpoint this article comes from.

> Professional software developers are in the business of solving practical problems for people with code. We are not, in our day jobs, artisans.

Well that's just wrong - we have failed to capture tremendous value through knowledge of our craft and to reduce it down to min-maxing effort/profit is not a pursuit that I can get behind. Nobody is talking about building Ivory towers, but we can do a lot better than most of the common denominator technologies in use today. We're failing our users most of the time (and of course people like the author say it doesn't matter because it doesn't show up in the quarterly results).

> Nobody cares if the logic board traces are pleasingly routed.

Speak for yourself.

tkgally

5 days ago

Both off and on topic: I got access to Claude’s voice mode today, and for my second test I asked it for its thoughts about both the linked blog post and the HN discussion here. The full transcript is at [1]. Some parts that stood out (slightly edited):

Me: So what is your assessment of the discussion at Hacker News?

Claude: The Hacker News discussion strikes me as much more grounded than the blog post itself. What I find compelling is that many commenters are sharing specific technical experiences rather than making broad ideological arguments. When someone describes spending hours fighting with Claude agents that keep refactoring code into spaghetti, that's concrete data about the current state of these tools. ...

Me: Well, what about you, Claude? You are made by a company whose president has recently made some public statements saying that, you know, 50% of white collar jobs will be replaced soon. And the company, Anthropic, also is very aggressively developing and selling tools to help with AI coding. What are your thoughts about that?

Claude: That's a really fair question to ask me directly. I think there's an obvious tension in my position here since I'm built by a company that has clear financial incentives around AI adoption, including coding tools. When I look at my own capabilities honestly, I can see both sides of this. I'm genuinely useful for certain coding tasks like explaining unfamiliar code, generating boilerplate, or helping debug specific problems. I've helped people learn new programming languages faster than they might have otherwise. But I'm also clearly not at the level where I could replace an experienced software engineer working on complex system design or novel problem solving. ...

[1] https://claude.ai/share/c934e67c-f132-43e8-8c31-ab238b5be25a

collinvandyck76

5 days ago

Thank you for sharing. I'm looking forward to getting voice mode, but after doing some searching I saw that it still needed the user to hit the "send" button after they are done talking. Did I misread, or was your experience that it would reply to you after you finished talking without the need to press anything?

tkgally

5 days ago

When you’re done talking, you tap anywhere on the screen to indicate that you’re finished talking, and Claude replies a second or two later. If you are silent for several seconds, it goes ahead and replies without waiting for you to tap. When Claude is talking you can tap on the screen to interrupt; that is actually quite useful, as it can be a bit long-winded.

It’s not quite as natural a back-and-forth as with ChatGPT’s or Gemini’s voice mode on mobile, but maybe that’s for the best.

I tried a test later in which I gave it a research paper—Sakana AI’s latest—and asked it to explain it to me. When it used a term I didn’t understand, I just tapped on the screen and asked for an explanation. That worked quite well.

collinvandyck76

5 days ago

Thanks a lot. This sounds great for my envisioned use case which is having conversations while on a road trip to research ideas that pop into my head. Not having to tap the screen makes it much safer.

ta_20250603

5 days ago

My AI enamored friends are all nuts.

I get it. You're being so productive. (Or so you claim. I don't believe you.)

But if I don't affirm your work choices, or embrace them myself, you lose your mind like a little caffeinated squirrel.

> If you can’t metabolize the boring, repetitive code an LLM generates: skills issue!

Case in point. Touch grass.

Joel_Mckay

5 days ago

Slop adding implicit arbitrary higher order tensors to linear problems.

Sounds cool, but not useful... lol

Reminds one of a classic fictional conversation... =3

https://www.youtube.com/watch?v=X6WHBO_Qc-Q

zahlman

5 days ago

> Level Setting

> First, we need to get on the same page. If you're [6 months out of date with the technology you think of as overhyped after having observed it for the last 4 years or so], you're [not fairly evaluating it].

Perhaps, but this is an extremely unconvincing approach to the argument right out of the gate - it's basically on the same level as "communism has never been tried".

Especially if you're going to start out by describing your interlocutors as "crazy", rather than positing that an effectively completely new thing now exists that ought to be evaluated differently from Copilot et. al.

forty

6 days ago

Why would anyone rather read and fix someone else code rather than writing the code themselves? I do a lot of code review for other human code and it use so much more energy than writing my own code (and surely, as I have competent colleagues, this is not even as bad as if I expected that the code that I'm reading could be totally random shit)

Mofpofjis

5 days ago

> Why would anyone rather read and fix someone else code rather than writing the code themselves?

Because their own review standards are low (so they find reviewing "easy"), and/or because they can't appreciate the emotional & mental fulfillment that coding provides.

aucisson_masque

6 days ago

I think it comes all down to that, do you have pride in what you do or you don’t ?

I make a wall with bricks, even if it will be covered with coating i will do my best to have regular joints and pacing.

Could make it faster, no one would notice the difference but me… i hate that feeling when you done something and you know it’s barely enough, just barely, it’s kind of shit and you really don’t want others to see it.

On the opposite side, some people will take pride in building wall twice as fast as me and won’t care it’s horrendous.

Both cases are valid, but me i know i can’t do a work I’m not proud of.

joshdata

5 days ago

Totally agree with you that pride is important, but there's definitely more. Being a good engineer is frequently a problem of design --- whether it's user experience or code abstractions. Design is partly art, and that makes us artisans at least some of the time. The code that I've written that has endured the longest --- decades --- has been code that was designed well.

Mofpofjis

5 days ago

> I make a wall with bricks, even if it will be covered with coating i will do my best to have regular joints and pacing.

Absolutely. This is at the core of it.

bgwalter

5 days ago

I'd be interested in the open source repositories that the author of this article created a) before "AI" and b) after "AI". I mean significant work mostly done by him.

Wowfunhappy

6 days ago

The most salient part of the article IMO:

Under the heading: "but you have no idea what the [LLM's] code is."

> Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you? You’ve always been responsible for what you merge to main. You were five years go. And you are tomorrow, whether or not you use an LLM. If you build something with an LLM that people will depend on, read the code.

> People complain about LLM-generated code being “probabilistic”. No it isn’t. It’s code. [...] The LLM might be stochastic. But the LLM doesn’t matter. What matters is whether you can make sense of the result

> Reading other people’s code is part of the job. If you can’t metabolize the boring, repetitive code an LLM generates [...] how are you handling the chaos human developers turn out on a deadline?

mcv

5 days ago

I wasn't familiar with these agents yet. Does anyone know what the best ones are? Is there a good comparison of the options and their strengths and weaknesses?

timewizard

6 days ago

> You’ll only notice this happening if you watch the chain of thought log your agent generates. Don’t.

"You're nuts!" says the guy with his head intentionally buried in the sand. Also way to tell me your business model is a joke without telling me your business model is a joke. Enjoy it while it lasts.

disambiguation

6 days ago

Man the redbull is oozing off this post, talk about sipping rocket fuel.

I mean a tool is a tool, nothing wrong with that - but most of the resistence stems from AI being shoved down our throats at warp speed. Its already everywhere and I can't opt out, that stinks.

As for the skepticism in terms of adoption and usefulness, its mainly a question of whether or not it will continue improving - there's no way to no what lies ahead, but if it came to a grinding halt today well then the high water mark just isn't all that impressive.

> Yeah, we get it. You don’t believe in IPR. Then shut the fuck up about IPR. Reap the whirlwind.

This is the point that matters, and I don't think everyone is on the same page that LLMs are essentially over glorified data laundering.

The industry would get just as much "value" if we declared a jubilee and wiped out all licenses and allowed unlimited plagiarism (Looking at Zuckerburg and his 10 TB of pirated data). In fact, if AI owners published their training data sets with a capable search engine, I would bet money of it out performing LLMs in most cases. Why waste all that man power reinventing Netflix again? Just copy paste the code and give everyone their time back, sheesh.

I'm in a role that is behind the times, using a bespoke in-house framework that is immune to the benefits of LLMs, so I don't get to see what you see - so as a skeptic, I'm not convinced this isn't just the illusion of speed. I have not seen convincing results, show me the amazing things being made by AI (AI tooling itself does not count) - but yes, maybe that's because its all siloed into walled gardens.

> But something real is happening. My smartest friends are blowing it off. Maybe I persuade you. Probably I don’t. But we need to be done making space for bad arguments.

Yeah all the arguments have been made, good and bad, we're all waiting to see how it plays out. But I'd rather take the side of being a skeptic - if I'm right then I'm in the right place. If I'm wrong, that's cool too, I don't mind playing catch-up. But fully embracing the hype is, IMO, tantamount to putting all your eggs in one basket, seems like a needless risk but if that's worth it to you to get ahead then by all means, slurp up the hype.

consumer451

6 days ago

I am just some shmoe, but I believe that devs fall into to major categories when it comes to LLMs: those with their own product ideas, and those without their own product ideas.

The prior look upon Claude Code/Cursor/Windsurf much more favorably, as they are able to ship their ideas much faster.

This is a bit of a hot take, so I would love any replies to bring me back down to earth.

PokerFacowaty

5 days ago

This is a great article and I think I agree with most of it, not being an agent user myself.

One point the author might be missing is:

> If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing

The tech giants and AI bros alike have been bombarding us with "but look what it can do NOW", "the new shiny model is revolutionary, now buy tokens", "look what's around the corner" and whatever else for over 2 years now. Can you really blame a typical "skeptic" for not being interested in putting their time to veryfing these claims themselves after seeing mediocre results so many times? You start to develop a filter for these things.

ofjcihen

6 days ago

I feel like we get one of these articles that addresses valid AI criticisms with poor arguments every week and at this point I’m ready to write a boilerplate response because I already know what they’re going to say.

Interns don’t cost 20 bucks a month but training users in the specifics of your org is important.

Knowing what is important or pointless comes with understanding the skill set.

mountainriver

6 days ago

I feel the opposite, and pretty much every metric we have shows basically linear improvement of these models over time.

The criticisms I hear are almost always gotchas, and when confronted with the benchmarks they either don’t actually know how they are built or don’t want to contribute to them. They just want to complain or seem like a contrarian from what I can tell.

Are LLMs perfect? Absolutely not. Do we have metrics to tell us how good they are? Yes

I’ve found very few critics that actually understand ML on a deep level. For instance Gary Marcus didn’t know what a test train split was. Unfortunately, rage bait like this makes money

Night_Thastus

6 days ago

Models are absolutely not improving linearly. They improve logarithmically with size, and we've already just about hit the limits of compute without becoming totally unreasonable from a space/money/power/etc standpoint.

We can use little tricks here and there to try to make them better, but fundamentally they're about as good as they're ever going to get. And none of their shortcomings are growing pains - they're fundamental to the way an LLM operates.

mountainriver

6 days ago

Most of the benchmarks are in fact improving linearly, we often don't even know the size. You can find this out but just looking at the scores over time.

And yes, it often is small things that make models better. It always has been, bit by slow they get more powerful, this has been happening since the dawn of machine learning

_dain_

6 days ago

remember in 2022 when we "hit a wall"? everyone said that back then. turned out we didn't.

and in 2023 and 2024 and january 2025 and ...

all those "walls" collapsed like paper. they were phantoms; ppl literally thinking the gaps between releases were permanent flatlines.

money obviously isn't an issue here, VCs are pouring in billions upon billions. they're building whole new data centres and whole fucking power plants for these things; electricity and compute aren't limits. neither is data, since increasingly the models get better through self-play.

>fundamentally they're about as good as they're ever going to get

one trillion percent cope and denial

jhonof

5 days ago

The difference in quality between model versions has slowed down imo, I know the benchmarks don't say that but as a person who uses LLMs everyday, the difference between Claude 3.5 and the cutting edge today is not very large at all, and that model came out a year ago. The jumps are getting smaller I think, unless the stuff in house is just way ahead of what is public at the moment.

yahoozoo

5 days ago

Yet we are still at the “treat it like a junior” level

nickpsecurity

5 days ago

"pretty much every metric we have shows basically linear improvement of these models over time."

They're also trained on random data scraped off the Internet which might include benchmarks, code that looks like them, and AI articles with things like chain of thought. There's been some effort to filter obvious benchmarks but is that enough? I cant know if the AI's are getting smarter on their own or more cheat sheets are in the training data.

Just brainstorming, one thing I came up with is training them on datasets from before the benchmarks or much AI-generated material existed. Keep testing algorithmic improvements on that in addition to models trained on up to date data. That might be a more accurate assessment.

mountainriver

5 days ago

thats not a bad idea, very expensive though, and you end up with a pretty useless model in most regards.

A lot of the trusted benchmarks today are somewhat dynamic or have a hidden set.

nickpsecurity

4 days ago

That could happen. One would need to risk it to take the approach. However, if it was trained on legal data, then there might be a market for it among those not risking copyright infringement. Think FairlyTrained.org.

"somewhat dynamic or have a hidden set"

Are there example inputs and outputs for the dynamic ones online? And are the hidden sets online? (I haven't looked at benchmark internals in a while.)

attemptone

6 days ago

>I feel the opposite, and pretty much every metric we have shows basically linear improvement of these models over time.

Wait, what kind of metric are you talking about? When I did my masters in 2023 SOTA models where trying to push the boundaries by minuscule amounts. And sometimes blatantly changing the way they measure "success" to beat the previous SOTA

mountainriver

6 days ago

Almost every single major benchmark, and yes progress is incremental but it adds up, this has always been the case

attemptone

5 days ago

We were talking about linear improvements and I have yet to see it

mountainriver

5 days ago

check the benchmarks or make one of your own

attemptone

5 days ago

I checked the BlEU-Score and Perplexity of popular models and both have stagnated around 2021. As a disclaimer this was a cursory check and I didn't dive into the details of how individuals scores were evaluated.

mountainriver

4 days ago

on what benchmarks? pretty much every major one is linear improvement

mrkurt

6 days ago

PLEASE write your response. We'll publish it on the Fly.io blog. Unedited. If you want.

ofjcihen

6 days ago

I’m uninterested in giving you content. In particular because of your past behavior.

Thanks for the offer though.

throwaway314155

6 days ago

> past behavior

Do go on.

tptacek

6 days ago

Kurt, how dare you.

ofjcihen

6 days ago

You wouldn’t happen to work for fly.io as well, would you?

Edit: Nm, thought I remembered your UN and see on your profile that you do.

grzm

6 days ago

Yes. And the author of the submission.

Mofpofjis

5 days ago

OMG I didn't notice. Way to burn down a huge amount of respect stemming from past cryptography work in just one blog post.

dogecoinbase

5 days ago

it's irksome to see tptacek having his patio11 moment for sure.

tptacek

5 days ago

What does this even mean? (I should be so lucky.)

kubb

6 days ago

Maybe make a video of how you're vibecoding a valuable project in an existing codebase, and how agents are saving you time by running your tools in a loop.

metaltyphoon

6 days ago

Seriously… thats the one thing I never see being posted? Is it because Agent mode will take 30-40 minutes to just bookstrap a project and create some file?

sensanaty

5 days ago

Well no, the reality of this workflow is the farcica, abject failures unleashed on the Dotnet codebase a week ago.

https://news.ycombinator.com/item?id=44050152

csallen

6 days ago

It takes like 2-6 minutes to do that, depending on the scope of the project

andrepd

6 days ago

So they can cherry pick the 1 out of 10 times that it actually performs in an impressive manner? That's the essence of most AI demos/"benchmarks" I've seen.

Testing for myself has always yielded unimpressive results. Maybe I'm just unlucky?

kubb

5 days ago

Livestream would be fair.

briandrupieski

6 days ago

> with poor arguments every week

This roughly matches my experience too, but I don't think it applies to this one. It has a few novel things that were new ideas to me and I'm glad I read it.

> I’m ready to write a boilerplate response because I already know what they’re going to say

If you have one that addresses what this one talks about I'd be interested in reading it.

slg

6 days ago

>> with poor arguments every week

>This roughly matches my experience too, but I don't think it applies to this one.

I'm not so sure. The argument that any good programming language would inherently eliminate the concern for hallucinations seems like a pretty weak argument to me.

ofjcihen

6 days ago

It’s a confusing one for sure.

To be honest I’m not sure where the logic for that claim comes from. Maybe an abundance of documentation is the assumption?

Either way, being dismissive of one of LLMs major flaws and blaming it on the language doesn’t seem like the way to make that argument.

simonw

6 days ago

Why does that seem weak to you?

It seems obviously true to me: code hallucinations are where the LLM outputs code with incorrect details - syntax errors, incorrect class methods, invalid imports etc.

If you have a strong linter in a loop those mistakes can be automatically detected and passed back into the LLM to get fixed.

Surely that's a solution to hallucinations?

It won't catch other types of logic error, but I would classify those as bugs, not hallucinations.

slg

6 days ago

>It won't catch other types of logic error, but I would classify those as bugs, not hallucinations.

Let's go a step further, the LLM can produce bug free code too if we just call the bugs "glitches".

You are making a purely arbitrary decision on how to classify an LLM's mistakes based on how easy it is to catch them, regardless of their severity or cause. But simply categorizing the mistakes in a different bucket doesn't make them any less of a problem.

layer8

6 days ago

I don’t see why an LLM wouldn’t hallucinate project requirements or semantic interface contracts. The only way you could escape that is by full-blown formal verification and specification.

ofjcihen

6 days ago

A good example of where a linter wouldn’t work is when the LLM has you import a package that doesn’t exist.

kubb

6 days ago

There's also the reverse genre: valid criticism of absolutely strawman arguments that nobody makes.

tptacek

6 days ago

Which of the arguments in this post hasn't occurred on HN in the past month or so?

sethev

5 days ago

Or in this very thread, for that matter.

kubb

5 days ago

"I tried copilot 2 years ago and I didn’t like it."

Great article BTW, it’s amazing that you’re now blaming developers smarter than you for lack of LLM adoption, as if it weren’t enough for the technology to be useful to become widespread.

Try to deal with „an agent takes 3 minutes to make a small transformation to my codebase and it takes me another 5 to figure out why it changed what it did only to realize that it was the wrong approach and redo it by hand, which took another 7 minutes” in your next one.

csallen

6 days ago

Can you direct me somewhere with superior counterarguments? I'm quite curious

calf

6 days ago

What valid AI criticisms? Most criticisms of AI are not very deep nor founded in complexity theoretic arguments, whereas Yann LeCun himself gave an excellent 1 slide explanation of the limits of LLMs. Most AI criticisms are low quality arguments.

therealpygon

6 days ago

“Valid” criticism rarely come from the people barely capable of understanding the difference between AI and LLMs, and using them interchangeably.

noobermin

6 days ago

Tbf the dismissal of the IP argument is a bit disappointing. Just because you don't respect IP doesn't mean we all do. Some of the code I'm proudest of I'm considering never open sourcing now because I simply don't trust online repositories anymore. Granted I'm in a bit of a different field than most developers but still, open source as a concept seems less attractive in the wake of tools that scrape it and copy it automatically without attribution. Count me out from ever sharing my code ever again.

I used to believe in the stereotypical copyleft anti IP ethos. In the modern day of corporate LLMs and disrespectful scraping, the lay of the ground is different and people understandably should be less inclined to open source their code.

Anamon

8 hours ago

This is also one of my predictions. LLM scrapers have made me add the "no derivatives" descriptor to the default CC license I publish all of my writing and side-projects under (for now, CC considers ND a valid legal defense against LLM scraping).

I still do stuff in the hopes of it benefitting anyone - but not parasitic oligarchs who are responsible for the imminent destruction of the world and society.

I don't know if a reasonable number of spare-time creators, authors, artists etc. feel similar about these things, but if they do, I'd expect a lot more content under non-free licenses in the future, that might've been published openly otherwise.

tom_m

5 days ago

I think people fail to realize that to value software to $0 is to destroy the point of most of it.

If it becomes so easy to write apps, then everyone can build their own. They won't need your app.

The entire point of most SaaS business revolves around the simple concept/question: build vs. buy.

So if there's no point in buying since building is so easy, then AI isn't helping you like your software product/SaaS.

I think programmers may just become the "IT person" or "webmaster" or "AI guy" at a company that needs software. Not at a separate company building it for others.

This may greatly reduce the need for programmers. This will in turn reduce the need for fancy computers like that expensive MacBook Pro. It will reduce the need for web hosting. Don't get me wrong, these things will still exist...but the demand will decrease.

Essentially it will become the biggest bubble since the dot com bust. It'll be much larger.

The dust will eventually settle and we'll all adapt to a new world with AI...but I think the years and years of over specialization could come to a screeching halt.

The industry will likely implode and suck in all of the orbiting industries along with it. I hope I'm wrong, but I don't think people quite understand here. I greatly suspect there will be far fewer sales of high end computers. Compute power will continue to move to data centers.

Think about it this way. Code aside, because it's not just a coding thing. If everyone can create amazing videos and full on films with AI, then they don't need all those high end machines. They don't need tons of hard drive space that takes all the raw footage and different cuts to make a video. You prompt and the get the exact result you need. Assuming AI gets that good.

So the more we push into AI the greater than "gravity" here so to speak and the stronger the implosion. I just see an entire industry collapsing in on itself personally. If I'm believing all the FOMO logic out there.

If "we'll see our first one person billion dollar company." Whoever said that is a moron. You understand what that means, right? If anyone can do all that, then how could that even exist? Are we all going to become billionaires now? No, to say something as stupid as that is to say software is trivial and worthless. So how could anyone start a software business worth so much?

nprateem

5 days ago

No.

The issue is maintenance and focus. Sure, smash out an AI accounting app, payroll app, CRM, etc but now you're not in the business of selling hamburgers or whatever, you've become a conglomerate.

nomilk

6 days ago

> "but you have no idea what the code is"

This identifies a critical lever: it doesn't matter that AI sometimes makes mistakes, hallucinates, makes poor architecture decisions. All that matters is your PR to master, because that's what you always have been and will continue to be judged on. Merging trash? That's on you! Failing to utilise new tools and taking 10x longer to make the feature? Also on you!

bigbuppo

6 days ago

Ah yes, because it was garbage six months ago, all that was thrown away and now they latched on to the "agentic" thing. At best this is pre-alpha-quality software not fit for production use. After all, it has been less than six months. Next it's going to be the post-agentic world, and then all the companies are going to go bankrupt once the bills are actually due. Any company that pivoted to video, err, AI is also going to go bankrupt, while everyone they laid off is going to build the next big thing and they'll live the heads of the AI charlatans on pikes outside of town to warn off any AI stragglers.

DebtDeflation

6 days ago

> Some of the smartest people I know share a bone-deep belief that AI is a fad — the next iteration of NFT mania

It's not that it's a fad. It's that the hype has gotten way ahead of the capability. CEOs laying off double digit percentages of their workforce because they believe that in 6 months AI will actually be able to do all those jobs and they want to get the message out to Wall St to juice the stock price today.

sanderjd

6 days ago

Both things can be true, and in my view, they are. I think there is a lot of "there" there with these tools, and increasingly so, and also that lots of people are out over their skis with the hype.

The key is to learn the useful tools and techniques while remaining realistic and open-eyed about their limitations.

DebtDeflation

5 days ago

That's why I said AI is not a fad. It's everywhere and is providing real utility. It's just not going to let CEOs replace the majority of their workforce like they think it will.

sanderjd

5 days ago

Yeah. I'm not sure how many of them think this. I think they mostly are paying attention, making sure their businesses aren't falling behind competitors, but also hedging their bets.

diabllicseagull

5 days ago

things are way beyond hype at this point and calling skeptics who raised the alarms along the way nuts is nuts. I'll also leave this here: https://www.youtube.com/watch?v=6ovuMoW2EGk

lerp-io

5 days ago

ai would be great if only the problems I was trying to solve were the right ones

K0balt

5 days ago

I’d love to know your tool stack?

OhMeadhbh

5 days ago

people do not believe AI is fad because of NFT mania. we believe LLMs are a fad because they're over-hyped and they under-deliver. just my $0.02. note that i didn't say "AI is Bullshit." but I will say some of the AI hype is bullshit. and one thing i've noticed is there are so many caveats for making accurate statements about LLMs, it's hard to make short statements that aren't wrong (unless you have all those exceptions and caveats.)

come at me.

Mofpofjis

6 days ago

> A heartfelt provocation

okay then (profanity begets profanity)

> Reading other people’s code is part of the job. If you can’t metabolize the boring, repetitive code an LLM generates: skills issue!

Do you /intend/ to spend /most/ of your work life reading AI puke? If so: sanity issue! You are advocating for replacing creative action that gives you Flow with reaction that gives you hemorrhoids, as the /bulk of the work/. Fuck right off.

> How are you handling the chaos human developers turn out on a deadline?

By correcting the deadline setting methodology for the next round of development.

> Does an intern cost $20/month? Because that’s what Cursor.ai costs

then rely on Cursor.ai to raise your next-gen seniors who are going to be supposed to review AI puke in the future

> but the craft [...] Do it on your own time

Go choke on a bag of dicks. You are the problem. This is why quality no longer exists in the world, only lowest common denominator consumerist shit. This is why cars, home appliances, and tools have been getting both worse and more user-hostile. Because nobody values the quality of their own creative output anymore. Quality should not be an exception (which makes it very expensive), but the standard. Quality should be a near-commodity, and therefore at least occasionally affordable for anyone.

> I buy a fucking table

A ridiculously low quality, but hey, cheap!, table, that will fail you at the most inopportune moment. IKEA does have its place, but it's probably not in your office where you receive clients. Do you want them to see the table top fly off?

> They almost certainly have a bigger bag of algorithmic tricks than you do: radix tries, topological sorts, graph reductions, and LDPC codes

and you'll have to review that bag of algorithmic tricks embedded in AI puke, with potential hallucinations introducing obscure, critical mathematical bugs. Good luck verifying all that as a senior, based on proofs and first principles. A human coder will either use a well-reviewed library, or cite a textbook, and implement the textbook very closely, so you can follow it, and have some trust in the intent at least.

> No profession has demonstrated more contempt for intellectual property

how disingenuous. you are showing contempt for copyright. You are totally missing the point of copyleft. Copyleft doesn't advocate for anarchy, it advocates for user freedom via the software copyright framework. When AI barfs GPL'd code snippets into a proprietary code base, that harms user freedom.

> and watched it in seconds spot LVM metadata corruption issues

you are drifting; this is not code generation. Fuzzy pattern matching has always been fair game.

kyle-rb

5 days ago

>> but you have no idea what the code is

> Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you?

> You’ve always been responsible for what you merge to main. You were five years go. And you are tomorrow, whether or not you use an LLM.

Loved this bit. In a professional setting, any code — AI-written or not — should be read by at least two people, the submitter and the reviewer.

throwawaybob420

5 days ago

Another day another post about AI by people whose job will be made redundant.

Something fascinating about this HN crowd in particular is its lack of foresight. Companies producing medicine have to take into account longterm effects, yet people working in these “tech” companies push out AI after AI with not a single care in how it’s going to impact people.

simonw

5 days ago

The more time I spend using AI-assisted programming tools the less worried I am that they're going to take my job.

Not because they aren't effective - they are - but because using them helps me appreciate how important my own skills and experience are for using them to effectively solve problems.

peterjliu

5 days ago

interesting are LLMs a lot better at Go than Rust?

thunderbong

5 days ago

> We are not, in our day jobs, artisans. Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed. If anything we build endures, it won’t be because the codebase was beautiful.

Very true and humbling.

are_you_sure

5 days ago

I'm not a skeptic because they can't produce code or whatever. Sure they are getting better at it.

I'm a skeptic because the megacorps have decide ai slop is the future and its going to push us deeper into hypercapitalist dystopia.

Programming as a career is going to end up as a ai slop janny. You aren't going to be paid more if your job is easier or whatever, you're gonna get paid less (and then end up doing more atomised work) because "an ai can do it". That's the direction that everything seems to be heading.

starkparker

5 days ago

I'm more interested than skeptical, but I'd take 500 simonw posts over another one of these "shove this concern up your ass" ones.

groestl

5 days ago

I find it curious that nobody in this whole thread does once mention AI communism. I think after much pain and sorrow, and many deaths, this is the most likely outcome. Probably in a dystopian fashion.

mkfs

5 days ago

This is pretty disingenuous and mean-spirited. He's using SV startup types to smear all developers, including the typical, humble, un- or undercompensated FOSS dev who absolutely respects intellectual property, and quite reasonably, expects their own IP to be respected as well. But because Sam Altman types follow the "It's Better to Beg for Forgiveness than to Ask for Permission" rule, altruistic, non-sociopathic developers have no right to complain?

tomnipotent

5 days ago

> non-sociopathic developers have no right to complain

The very same developers that advocate pirating from Netflix or Disney?

He's pointing out the hypocrisy in a class of people (generally, tech-savvy programmers) that have no problem advocating for piracy and gleefully thwarting the concrete IP rights of businesses on one hand, while defending and bemoaning the abstract IP rights of other content creators on the other hand.

My take away is that he's saying IP rights is a weak argument to use to defend a personal position for not adopting LLM's in your workflow, especially given the context of modern piracy.

mkfs

5 days ago

> The very same developers that advocate pirating from Netflix or Disney?

Even the ones advocating or engaging in personal piracy aren't creating unauthorized derivative works and monetizing them, which is arguably what these AI systems and their customers are doing. And Thomas wasn't talking about personal piracy, but instead trying to use the IP-hostile conduct of Google and certain startups (like Scribd or GrooveShark) to smear all developers and de-legitimize their IP-based objections, even of those releasing code under permissive licenses (like the GPL) in exchange for expected adherence to certain terms and conditions that these systems and their users ignore.

tomnipotent

5 days ago

> but instead trying to use the IP-hostile conduct of Google and certain startups (like Scribd or GrooveShark) to smear all developers

I challenge you to point out the language in his post making this claim. Otherwise you're just making stuff up.

He's specifically addressing high-level arguments used against LLM's, and this case he's calling out the fact that one group of people are railing against LLM providers for "IP theft" while simultaneously encouraging and using pirated content. It's perfectly reasonable to call out the hypocrisy in these contradictory positions, to demonstrate that the "IP theft" argument is mostly virtue signaling.

mkfs

4 days ago

> I challenge you to point out the language in his post making this claim. Otherwise you're just making stuff up.

Here:

> The great cultural project of developers has been opposing any protection that might inconvenience a monetizable media-sharing site. When they fail at policy, they route around it with coercion. They stand up global-scale piracy networks and sneer at anybody who so much as tries to preserve a new-release window for a TV show.

He doesn't explicitly name corporate entities, but it's clearly the ones founded and run by "move fast, break things" types (and I suspect he's one of them) that he's using to smear all developers, even the humble ones who aren't nearly as rich as him and whose open source software he used to build his business.

tomnipotent

3 days ago

That quote doesn't even remotely align with your previous comment. You have to do some serious mental gymnastics and intentionally interpret with bad intentions to reach the conclusion you did.

> but it's clearly the ones founded

Clearly in your imagination, since you're just putting words in his mouth.

XiphiasX

6 days ago

All good. They will suffer, and you will laugh.

ImaCake

5 days ago

>My AI skeptic friends are all nuts

This is dead right; my AI skeptic friend are ideologically opposed to LLMs and GenAI.

Their arguments have nothing to do with how good or bad LLMs are:

- IP issues (for code in a jupyter notebook which will never face IP scrutiny)

- because it exploits slaves and poor people (but so does the computer they use and the clothes they wear).

I also have a few programmer friends who make more legitimate complaints about code quality but because they are friends there is also an obvious subtext of ideological opposition.

My opinion is you should use LLMs anyway; you can hate "capitalism" and still take advantage of the system.

20k

5 days ago

>All progress on LLMs could halt today, and LLMs would remain the 2nd most important thing to happen over the course of my career.

During my time as a programmer (gaming + astrophysics), the following things have happened:

1. The rise of GPGPU programming, which has enabled 1000x increases in performance

2. A revolution in CPU architecture, from single core, to massively multicore

3. C++98 - C++26

4. Transformational increases in CPU memory

5. (In the graphics space): dedicated raytracing hardware, the fully general purpose GPU + OpenCL/CUDA, deferred rendering, FSR/DLSS

6. Mobile phones were invented

7. 64-bit

8. The internet went from being for MSN and facebook to what it is today

9. The rise of Valve/Steam and the incredible transformational existence of the indie games market, which enables individuals to self publish their code and products for virtually nothing

10. Clang was born, and GCC came back to life. MSVC became standards compliant, finally

11. Rust was invented, and is taking off. People actually take security seriously now (!)

12. Unified CPU/GPU memory spaces, GPU architectures becoming scalar etc

All of these have had vastly more impact on my programming than LLM's ever had. I've always a bit wondered who people are who find it transformational, because I can vaguely gesture at any part of my programming tooling and find that its vastly different to 10 years ago

>interact with Git, run existing tooling, like linters, formatters, and model checkers, and make essentially arbitrary tool calls (that you set up) through MCP.

I mean.. you can just do all these things anyway. Its literally more work for me to use an LLM to run a linter than to run a linter (its built into my IDE). So's git. And formatters as well. You can also make arbitrary tool calls.. through your ide/scripts/precommit/postcommit/etc. I have no idea why you'd even want an LLM to do this!

>People complain about LLM-generated code being “probabilistic”. No it isn’t

>The LLM might be stochastic

????????? That's literally the entire point. I want deterministic answers, like clang based autocomplete instead of the nightmare that is prob- i mean stochastic autocomplete

>If hallucination matters to you, your programming language has let you down.

They matter beacuse they waste your time with bullshit that you then have to fix. No programming language can truly enforce correct logic constraints which are one of the primary difficulties with writing code. I literally have no idea what the OP is even doing, actually writing code has always been a vastly minimal amount of time - the main bottleneck is the train of thought to make sure that everything's correct. The literal typing is an afterthought. No programming language can bypass that step, they can (at best) handhold you through certain kinds of problems

>Does an intern cost $20/month? Because that’s what Cursor.ai costs.

For the moment. They'll jack up the prices while enshittifying, and then good luck. I do not want to subscribe to a service to be able to code. I use free IDEs and free tools for exactly this reason. If you rely on a paid-for 3rd party you are doing it wrong and will regret it. This is one of the silliest things in the whole article

>But people select languages in part based on how well LLMs work with them, so Rust people should get on that

They really don't. People select languages based on:

1. What they're taught at school/university

2. What language meets their domain requirements

3. What language their job requires

>Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.

Are you joking. Is this article a troll? Programmers give a lot of a crap about copyright law. GPL/MIT/etc are hugely important, and people respect the ever loving crap about it in general - even most major companies won't just blatantly pirate GPL works

There's a tonne more in this article, but it smells like someone who has literally no idea what they're talking about talking out of their arse, and it shows profoundly

pona-a

6 days ago

> It’s projection. People say “LLMs can’t code” when what they really mean is “LLMs can’t write Rust”. Fair enough! But people select languages in part based on how well LLMs work with them, so Rust people should get on that.

How is it the responsibility of the Rust community that there weren't enough metric tons of free code for the machine to slurp up? And the phrasing makes it sound like it's the community's fault for not feeding OpenAI enough code to be stripped of its license and authorship and get blended into a fine latent soup. It's a lot like people coming to a one-man FOSS project with a laundry list of demands, expecting to be treated with the religious reverence of a major enterprise contract.

The whole tone, the pervasive "use it or you'll be left behind"—where users saying they don't want or need it only proves further evidence of its imminent apotheosis—superficially reminds me of previous FUDs.

And how is it not concerning that the thing described as intelligent needs billions of lines to generalize a language a human can learn from a single manual? Will it need hundreds of kLOC to internalize a new library, or even its new version, beyond in-context learning? The answer is yes; you are choosing to freeze the entire tech stack, when fixing its abstractions could actually save you from boilerplate, just so the machine can write it for you at $200 a month with a significant error rate.

benmmurphy

5 days ago

Is the problem with LLMs and Rust really a problem with the training data or a problem with it being harder to get something to compile in Rust? I presume its easier to get something to compile in the dynamically typed languages, and then slightly more difficult in statically typed languages and then a step up with Rust which also forces you to deal with the lifetime of objects. But maybe with LLMs its different and they can do static typing easier than dynamic but they just really struggle with structuring a program to properly deal with ownership.

nickpsecurity

5 days ago

They mostly copy and mix patterns in the training data. Lots of repetition with variations on them is helpful for their generalization. Languages like Python and Go have a ton of code in Github, etc like that. I saw that using Python with GPT 3.5/4.

If it's a rarer language, the math doesn't do as good of a job on piles of random code. There's just not enough for it to learn from. I cant speak for Rust since I dont know the numbers but imagine it's much less than Python or Go.

I have seen some evidence, though, that harder languages are harder for them to code in. GPT 3.5 used to struggle with C++ for something that it could easily produce in Python. It could actually produce things in C more easily than C++. It makes sense, though, because there's both more context needed for correctness and more behavioral patterns to write it.

My solution, which I only prototyped in GPT due to leaving AI, was to use AI's to write code in languages like Python which non-AI tools transpiled to high-performance code in C++ or Rust. Think the Python to C++ compiler or maybe Nikita. Later, with hallucinations mitigated enough, add LLM's to those transpilers.

As a side benefit, it let's you sell a product accelerating or increasing predictability of applications in that language. That's a non-AI investment. There's a few companies doing that, too. So, one could sell to the AI crowd, the "language X in business" crowd, or both.

gar1t

2 days ago

I code to understand. That's it. That's all. My code is a reflection of a problem definition and a solution. The problem may be under- or poorly defined and the solution may be lousy but by God it's there as a clear reference to be further reasoned about.

Programmers who don't work to deepen their understanding? Nah, sorry, not programmers I want anywhere near the code I'm working on.

To the extent that an LLM can help deepen understanding, good! Use it! Use its code. But you use it according to your intentional, systematic understanding of what the you're doing and why. And that understanding is not a function of code generation or, worse, code churn for the sake of feeling like you're going fast and paying less.

> LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious.

I've been writing code for over 30 years and I can't recall ever writing "tedious code". I don't know what this even means. Code, by definition, should not be tedious. At any layer of the stack, code should solve non-tedious problems. The whole point of code is to avoid tedium.

Most code, most projects? I haven't seen this and doubt it's even close to true - if it is God help us. If you write tedious code, find another place to work or another career.

> Sometimes, gnarly stuff needs doing. But you don’t wanna do it. So you refactor unit tests, soothing yourself with the lie that you’re doing real work.

Seriously who kills time by refactoring unit tests? Is this projection by the author?

> But an LLM can be told to go refactor all your unit tests.

Wtaf?? The unit tests are what stands between the blind ambitions of a statistical model and correct code. Handing over tests to an LLM? One staggers.

Tests represent part of that problem definition and solution. They're integral to the code. They foster and deepen understanding - not to be subject to the musings of statistical process, unless those musings serve as a dialectic.

> If you can’t metabolize the boring, repetitive code an LLM generates: skills issue! How are you handling the chaos human developers turn out on a deadline?

Answer: Slow the F down! Chaos and programming don't belong. Managers who drive their teams like this are bad managers.

> If hallucination matters to you, your programming language has let you down.

The author seems to think that programming is about producing code. Producing code that compiles is not a problem. Producing correct and coherent solutions to problems that need to be solved is a problem. Hallucinations aside, how will you know that generated code is something to keep or reject unless you have some understanding of what the code should do before you see it?

> Does an intern cost $20/month? Because that’s what Cursor.ai costs.

Squee! Code for all my friends!

> but it’s bad at rust

No, it's bad at understanding. And so are humans, initially. But it's our job to build and deepen understanding and we use code to facilitate that.

> Professional software developers are in the business of solving practical problems for people with code.

Correct.

> [LLMs] devour schlep, and clear a path to the important stuff, where your judgement and values really matter.

This begs the question, who defines what's important? It sounds like the author agrees that it's us, not the model. I agree this is a useful method. It's like any code review - it sharpens the reader no matter the code quality.

> As a mid-late career coder, I’ve come to appreciate mediocrity.

How cool and nonchalant what a rebel he says the opposite of everyone what a baddie!

I think the author means "appreciate naivety", meaning, a first pass that's good enough is good enough - there's no need to go further. I'll concede this, but it's a concession and not something to boast about. Odds are decent you'll come back to this naive implementation and revise it. Yes, at a certain point there's diminishing returns but good programmers know when to invest in revision and when to move on.

> LLMs really might displace many software developers.

We've seen this cycle. Technology and globalization puts pressure on programmers. There are legions of programmers in parts of the world that will generate code for $20 an hour. You can pay $20 and hour - or $20 a month - and generate mountains of code that blows chunks - not because it's ugly but because it's incoherent or because it solves the wrong problem. We. have. seen. this. It's why Silicon Valley still pays massive salaries to good programmers.

> The cool kid haughtiness about “stochastic parrots” and “vibe coding” can’t survive much more contact with reality.

Time will tell. In the meantime I'll keep investing in understanding, not code, and see how things shake out.

PeterHolzwarth

5 days ago

Wow, I'm surprised this post hasn't gotten hit by the HN flame-war detector: We have threads downvoted into forced-collapsed that are headed by completely innocuous comments. It really is strange how people here are not able to accept different viewpoints in a discussion: I've three times now returned to the comments to just blanket-upvote everything that's starting to turn grey.

zephyrfalcon

5 days ago

This is satire, right?

...Right?

keybored

6 days ago

Thankfully the uncrazy person is going to get us on that sane VC AI wavelength.

> If you’re making requests on a ChatGPT page and then pasting the resulting (broken) code into your editor, you’re not doing what the AI boosters are doing. No wonder you’re talking past each other.

They’re playing 3D chess while you’re stuck at checkers.

I do things suboptimally while learning the ropes or just doing things casually. That doesn’t mean that I judge the approach itself by my sloppy workflow. I’m able to make inferences about what a serious/experienced person would do. And it wouldn’t involve pasting things through three windows like I would do.

So of course I don’t judge AI by “ask chatty and paste the response”.

Yes indeed: “deploying agents” is what I would imagine the Ask Chatty And Paste workflow taken to Perfection to look like.

Most Rube Goldberg machines are very tedious and consist of fifty-too-many parts. But we can automate most of that for you—

I could not have ever imagined a more Flintstones meets Science Fiction clash than AI According To Software Engineers. You’re using AI to generate code. And no one cares how much. It’s just so tedious in any case.

A wortwhile approach would have been to aspire to make or generate technology artifacts that could be hidden behind a black box surface with a legible interface in front. Is the code tedious? Then make the AI come up with something that is well-designed, where the obvious things you want is given freely, where minor customizations are just minor tweaks, and larger deviations require only proportionally larger changes. Uh, how about no? How about generating 20KLOC line “starter” some-framework project with all the 20KLOC “tedious” bits hanging out, then we can iterate from there. The AI made a Git log and everything so it’s ya know audited.

But maybe I’m being unfair. Maybe we are moving towards something not quite as stupid as Deploy ChatGPT 50X? Or maybe it’s effectively going to behind a black box. Because ya know the AI will deal with it all by itself?

> Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you?

> You’ve always been responsible for what you merge to main. You were five years go. And you are tomorrow, whether or not you use an LLM.

No!, and what the fuck is wrong with you? We are Flintstone technologists and I’ll be damned if I can’t get my AI brain chip-injected, genetically enhanced for speed horsey cyborg for my modern horse-drawn carriage patent.

skor

5 days ago

well, I'm going to reference [1] a comment on the "Who Cares Era" that really stood out and maybe should get pinned somewhere.

What about that "superpower" we had to make something out of _nothing_ but our own brains? Sure, you may have looked up some code (do some research and learn), but that mental exercise.. just knowing your code inside-out, can never be compared to the "oh look at that, it actually works" with LLM stuff.

Well guess what, now, you traded that superpower for a subscription! (yes, you are now a middle man of software peddling - paying some company that stole other peoples code and ideas and did not pay _them anything_ in return), and eventually you will get to a point when the internet is down, you won't write any software that day cause you cant get the N model to work.

Did we go to the next step in software evolution? If it has a cost per month (and you pay thieves) (yes that what they are), its going to be a hard "no" for me, you just became a cow. Sure, now this is whats expected of you? OK. This has some strong neoliberalism drift, I assume I can be wrong so please comments welcome I am quite interested in this part.

yes, augmentations seem like amputations, is it McLuhan? I'm not sure, Will we retain our power to produce software? No! Humans dumb down very quickly, a generation is all it took to reduce reading to kids below expected educational levels [2]. There will be less professionals and lots of peddlers that have no skills, and software will be so junk it will be seriously scary to catch a plane in 10 years time.

Interesting times.

[1] https://news.ycombinator.com/item?id=44116724 [2] https://www.theguardian.com/lifeandstyle/2025/jun/02/gen-z-p...

jurgenaut23

5 days ago

Something that I feel is missed in every such post or debate is that, maybe, this isn’t a yes/no question.

Fast forward 10 years, the skeptics will be right on a few things, and the fanboys will be right on other things.

I for one know this isn’t just a “fad” like NFTs were, but I cannot help but feel skeptical about the sweeping claims of software engineering’s doomsday.

kakadu

6 days ago

A hammer hammers.

It hammers 100% of the time, with no failure.

It requires the same amount of labour from my part but it delivers the same outcome every time.

That is what tools do, they act as an extension and allow you to do things not easily done otherwise.

If the hammer sometimes hammers, sometimes squeaks and sometimes screws then it requires extra labour from my part just to make it do what purpose specific tools do, and that is where frustrations arise.

Make it do one thing excellent and we talk then.

IshKebab

6 days ago

This is the kind of non-serious argument he's talking about. There are plenty of tools that require supervision to get good results. That doesn't make them useless.

My 3D printer sometimes prints and sometimes makes spaghetti. Still useful.

threeseed

6 days ago

They never said it was useless. You just invented that straw man in your head.

3D printing is largely used for prototyping where its lossy output is fine. But using it for production use cases requires fine tuning it can be 99.9% reliable. Unfortunately we can't do that for LLMs hence why it's still only suitable for prototyping.

Philpax

6 days ago

But you can adjust the output of a LLM and still come out ahead in both time and mental effort than writing it by hand. Unlike a 3D printer, it doesn't have to be right the first time around to still be useful.

bigstrat2003

6 days ago

> But you can adjust the output of a LLM and still come out ahead in both time and mental effort than writing it by hand.

No you can't, or at least I can't. LLMs are more work than just doing it by hand.

bawolff

6 days ago

There is a big difference between "not entirely useless" and best tool for the job.

okanat

6 days ago

You don't use 3D printing to do large-scale production. If you agree that AI should only be used in prototype code and nothing else, then your argument makes sense.

tpmoney

5 days ago

Depending on your definition of "large-scale production" Prusa famously 3d prints a number of components in their production 3d printers.

rxtexit

5 days ago

IMO this is exactly the wrong mental model.

You can't hammer a nail a 1000 times and pick the best hammered nail.

You can have the hammer iterating over the structure 24/7, finding imperfections in previous hammered nails.

This is imposing arbitrary constraints on the model and that when you give a human just a hammer, they tend to start to view everything like a nail.

mrbungie

6 days ago

You know what's nuts? How so many articles about supporting LLMs and against skeptics are so full of fallacies and logical inconsistencies like strawmans, false dichotomies, appeals to emotion and to authority when they have supposedly almost AGI machines to assist them in their writing. They could at least do a "please take a look at my article and see if I'm commiting any logical fallacies" prompt iteration session if they trust these tools so much.

These kinds of articles that heavily support LLM usage in programming seem to FOMO you or at least suggest that "you are using it wrong" in a weak way just to invalidate contrary or conservative opinions out of the discussion. These are pure rhetorics with such an empty discourse.

I use these tools everyday and every hour in strange loops (between at least Cursor, ChatGPT and now Gemini) because I do see some value in them, even if only to simulate a peer or rubber duck to discuss ideas with. They are extremely useful to me due to my ADHD and because they actually support me through my executive disfunction and analysis paralysis even if they produce shitty code.

Yet I'm still an AI skeptic because I've seen enough failure modes in my daily usage. I do not know how to feel when faced with these ideas because I feel out of the false dichotomy (pay for them, use them every day, but won't think them as valuable as the average AI bro). What's funny is that I'm yet to see an article that actually shows LLMs strengths and weaknesses in a serious manner and with actual examples. If you are going to defend a position, do it seriously ffs.

dbalatero

5 days ago

Just for fun, I asked ChatGPT and it came up with 30+ fallacies + examples. I'm sure some is hallucinated, but woof:

https://chatgpt.com/share/683e62ed-e118-800f-a404-bd49bec799...

chinchilla2020

5 days ago

As I said in another post. The article is pure rhetoric. It provides no actual numbers, measurements, or examples.

It's just "AI did stuff really good for me" as the proof that AI works

morning-coffee

6 days ago

Check, please.

mlsu

5 days ago

I tried the agent thing on:

- Large C codebase (new feature and bugfix)

- Small rust codebase (new feature)

- Brand new greenfield frontend for an in-spec and documented openAPI API

- Small fixes to an existing frontend

It failed _dramatically_ in all cases. Maybe I'm using this thing wrong but it is devin-level fail. Gets diffs wrong. Passes phantom arguments to tools. Screws up basic features. Pulls in hundreds of line changes on unrelated files to refactor. Refactors again and again, over itself, partially, so that the uncompleted boneyard of an old refactor sits in the codebase like a skeleton (those tokens are also sent up to the model).

It genuinely makes an insane, horrible, spaghetti MESS of the codebase. Any codebase. I expected it to be good at svelte and solidJS since those are popular javascript frameworks with lots of training data. Nope, it's bad. This was a few days ago, Claude 4. Seriously, seriously people what am I missing here with this agents thing. They are such gluttonous eaters of tokens that I'm beginning to think these agent posts are paid advertising.

vitaflo

5 days ago

It’s entirely possible that the people talking up agents also produced spaghetti code but don’t care because they are so much more “productive”.

An interesting thing about many of these types of posts is they never actually detail the tools they use and how they use them to achieve their results. It shouldn’t even be that hard for them to do, they could just have their agent do it for them.

phkahler

5 days ago

>> It’s entirely possible that the people talking up agents also produced spaghetti code but don’t care because they are so much more “productive”.

You may be right. The author of this one even says if you spend time prettying your code you should stop yak shaving. They apparently don't care about code quality.

blibble

5 days ago

> You may be right. The author of this one even says if you spend time prettying your code you should stop yak shaving. They apparently don't care about code quality.

brought to you by fly.io, where the corporate blog literally tells you to shove your concerns up your ass:

> Cut me a little slack as I ask you to shove this concern up your ass.

lumost

5 days ago

The agent/model being used makes a huge difference. Cline with Claude 3.7 is ridiculously expensive but useful. Copilot is vaguely ok.

galangalalgol

5 days ago

Even just doing the cut and paste thing for one shots, claude sonnet 4 writes good rust code, generally on the first try.

runjake

5 days ago

You’re not providing a key piece of information to provide you with an answer: what were the prompts you used? You can share your sessions via URL.

A prompt like “Write a $x program that does $y” is generally going to produce some pretty poor code. You generally want to include a lot of details and desires in your prompt. And include something like “Ask clarifying questions until you can provide a good solution”.

A lot of the people who complain about poor code generation use poor prompting.

phkahler

5 days ago

It'd be nice if the AI advocates shared prompts, or even recorded entire sessions. Then we could all see how great it really is.

runjake

5 days ago

They do. All over the place.

Simon Willison has some great examples in his blog and on his GitHub. Check out Karpathy’s YouTube videos as well.

https://simonwillison.net/

jonas21

5 days ago

This one's been on the HN front page all day:

https://news.ycombinator.com/item?id=44159166

hermanradtke

5 days ago

Many do. In their GitHub repos right next to the code.

foobarian

5 days ago

"Infrastructure as Prompts" (IaP)

lexandstuff

5 days ago

Prompt engineering isn't really that important anymore imo. If you're using a reasoning model, you can see if it understood your request by reading the reasoning trace.

simonw

5 days ago

Disagree, effective prompting remains a crucial skill that's difficult to acquire.

I've been developing my prompting skills for nearly three years now and I still constantly find new and better ways to prompt.

I also consider knowing what "use a reasoning model" means to be part of that skill!

jacob019

5 days ago

That's a very dangerous thought. Prompt engineering evolved is just clear and direct communication. That's a hard thing to get right when talking to people. Heck, personally I can have a hard time with clear and coherent internal dialog. When I am working with models and encounter unexpected results, it often boils down to the model giving me what I asked for instead of what I want. I've never met anyone who always knows exactly what they want and is able to articulate it with perfect clarity. Some of the models are surprisingly good at figuring out intent, but complexity inevitably requires additional context. Whether you are working with a model or a person, or even your future self, you must spend time developing and articulating clear specifications, that is prompt engineering. Furthermore, models don't "think" like people--there's technique in how you struture specifications for optimal results.

lexandstuff

5 days ago

Fair enough. I guess I was mainly thinking of how rarely I need to utilise the old prompt engineering techniques. Stuff like: "You are an expert software developer...", "you must do this or people will die." etc

I just tell the AI what I want, with sufficient context. Then, I check the reasoning trace to check it understood what I wanted. You need to be clear in your prompts, sure, but I don't really see it as "prompt engineering" any more.

runjake

5 days ago

Yeah I strongly disagree with that. I think prompts are very critical.

As with any other project, it’s best to specify your wants and needs than to let someone or an LLM to guess.

phkahler

5 days ago

There a many ways to do something wrong and few ways to do them right. It's on the AI advocates to show us session logs so we can all see how it's done right.

chucknthem

5 days ago

How are you writing your prompts? I usually break a feature down to smaller task level before I prompt an agent (claude code in my case) to do anything. Feature level is often too hard to prompt and specify in enough detail for it to get right.

So I'd say claude 4 agents today are at smart but fresh intern level of autonomy. You still have to do the high level planning and task break down, but it can execute on tasks (say requiring 10 - 200 lines of code excluding tests). Any asking it to write much more code (200+ lines) often require a lot of follow ups and disappointment.

jvanderbot

5 days ago

presentation

5 days ago

I have not had a ton of success getting good results out of LLMs but this feels like a UX problem. If there’s an effective way to frame a prompt why don’t we get a guided form instead of a single chat box input?

presentation

5 days ago

Coding agents should take you through a questionnaire before working. Break down what you are asking for into chunks, point me to key files that are important for this change, etc etc. I feel like a bit of extra prompting would help a lot of people get much better results rather than expecting people to know the arcane art of proompting just by looking at a chat input.

consumer451

5 days ago

I am just a muggle, but I have been using Windsurf for months and this is the only way for me to end up with working code.

A significant portion of my prompts are writing and reading from .md files, which plan and document the progress.

When I start a new feature, it begins with: We need to add a new feature X that does ABC, create a .md in /docs to plan this feature. Ask me questions to help scope the feature.

I then manually edit the feature-x.md file, and only then tell the tool to implement it.

Also, after any major change, I say: Add this to docs/current_app_understanding.md.

Every single chat starts with: Read docs/current_app_understanding.md to get up to speed.

The really cool side benefit here is that I end up with solid docs, which I admittedly would have never created in the past.

cyral

5 days ago

You can ask it to do this, in your initial prompt encourage it to ask questions before implementing if it is unsure. Certain models like o4 seem to do this more by default rather than Claude that tends to try to do everything without clarifying

jaggederest

5 days ago

I mean if you ask Claude code to walk through what you should do next with you it'll ask lots of great questions and write you a great TODO.md file that it'll then walk down and check the boxes on.

You don't exactly need to know prompting, you just need to know how to ask the AI to help you prompt it.

Affric

5 days ago

I feel like when you prompt an LLM the LLM should take it almost as "what would the best possible prompt for this prompt be and then do that"...

jacob019

5 days ago

I don't think it's fair to call that the agent thing. I've had profoundly positive results with agentic workflows for classification, analysis, and various business automations, including direct product pricing. You have to build an environment for the agent to make decisions in, with good instructions for what you want them to do. Then you wire it up so that the decisions have effects in the real world. You can acheieve really good results, and there is a lot of flexibility to tweak it and various tricks to optimize performance. Tools can allow agents to pull in relevant context as needed, or to execute complex multistep workflows. That is the agent thing.

Writing code is one thing that models can do when wired properly, and you can get a powerful productivity boost, but wielding the tools well is a skill of it's own, and results will vary by task, with each model having unique strengths. The most important skill is understanding the limitations.

Based on your task descriptions and the implied expectation, I'm unsurprised that you are frustrated with the results. For good results with anything requiring architecture decisions have a discussion with the model about architecture design, before diving in. Come up with a step by step plan and work through it together. Models are not like people, they know everything and nothing.

turtlebits

5 days ago

Have it make small changes. Restrict it to a single file and scope it to <50 lines or so. Enough that you can easily digest without making it a chore.

declan_roberts

5 days ago

A small change scoped to <50 lines is something easy to write for a normal software engineer. When do the LLMs start doing the hard part?

skydhash

5 days ago

A small change around 50 lines is the size of an advent of code solution (the hardest part). Most of the code you write around that is for defensive coding (error handling, malformed input, expected output,…) which is the other hard part. Then you connect these cores to form a system and that’s another tough problem. And it needs to evolve to.

We’ve built tools to help us with the first part, framework with the second, architecture principles with the third and software engineering techniques for the fourth. Where do LLMs help?

kasey_junk

5 days ago

When you wire them up to your cicd process with pull requests and the github gui as your interface, rather than sitting there passively riding along as it prompts you the changes it’s going to make.

With my async agent I do not care about how easy it is for me, it’s easier to tell the agent to do the workflow and comeback to it later when I’m ready to review it. If it’s a good change I approve the pr, if not I close it.

simonw

5 days ago

Can you type that 50 line change in less than 15 seconds?

soraminazuki

4 days ago

It'd take less than the time it takes to come up with The Perfect Prompt.

citizenpaul

5 days ago

>these agent posts are paid advertising.

I'm 100% certain most if not all of them are, there is simply too much money flying around and I've seen things that marketing does in the past for way less hyped products. Though in this specific case I think the writer may simply be shilling AI to create demand for their service. Pay us monthly to one click deploy your broken incomplete AI slop. The app doesn't work? No problem just keep prompting harder and paying us more to host/build/test/deploy it...

I've also tried the agent thing and still am with only moderate success. Cursor, Claud-squad, goose, dagger AI agents. In other words all the new hotness, all with various features claiming to solve the fact that agents don't work. Guess what? they still don't.

But hey this is HN? most of the posters are tech fearing luddies right? All the contention on here must mean our grindset is wrong and we are not prompting hard enough.

There is even one shill Ghuntly that claims you need to be "redlining" ai at the cost of $500-$1000 per day to get the full benefits. LOL if that is not a veiled advertisement I don't know what is.

simonw

5 days ago

Nobody has paid me to write anything about AI and if they did I would disclose it.

ghuntley

5 days ago

ditto - https://ghuntley.com/disclosures

citizenpaul

5 days ago

Forgive me if I don't consider your personal blog an authority of your honesty. "I'm not a liar, for real dude see it says right there"

Why are all your public projects "joke/toy projects" if AI is so awesome and production ready? My experience reflects this as the truth. Yet your work backs up my experience rather than your words.

To avoid only being snark. I think all software is about power/control and software has allowed unprecedented concentration of power Which is why it resists being official like other industries. No one with power wants restrictions on their power/software. Ultimately AI is good for small'ish projects and a productivity multiplier(eventually). I think it will lead a new revolution in distilling the current incumbents in the business world that are stagnant on vast proprietary software systems that previously could not be unseated. Small players will be able to codify/automate their business to make it competitive with big players. So I'm not "anti-ai".

edit: AI will simultaneously rot the proprietary software advantage from the inside out. As companies are further convinced that AI can solve their problem of having to pay people to maintain their software.

simonw

5 days ago

I think it's pretty counter-productive to default to not trusting anyone under any circumstances.

Having a "disclosures" page on a personal website is a pretty strong quality signal for me - it's inspired me to set a task to add my own.

As with all of these things, the trick is to establish credibility over time. I've been following Geoff for a few years. Given his previous work I think he has integrity and I'm ready to believe his disclosures page.

citizenpaul

5 days ago

I get what your saying.

However we seem to live in a time where integrity is basically valued at zero or more commonly as something to "bank" so you can cash in for an enormous payoff when the time comes. I agree he seems authentic, therefore valuable. Which means an AI company can come and offer him 7-8 figures to build hype. I think its hard for people to truly grasp just how much money is flying around in hype cycles. Those numbers are not unrealistic. That's set for life money, not many are in a position to refuse that kind of wealth. (he lives in van, just saying)

I hope he is one of the few authentic people left but the world has left me jaded.

simonw

5 days ago

Secretly offering someone 7-8 figures to hype for you is a big business risk to take on.

If details of that deal leak, it's a big embarrassment for the company.

In the USA it is also illegal. There are substantial FTC fines to worry about. If it affects the stock price it could be classified as securities fraud (Matt Levine will happily tell you that "everything is securities fraud").

citizenpaul

4 days ago

>If details of that deal leak, it's a big embarrassment for the company.

Intermediaries.

Also IMO the risk of someone whom is not already rich turning down that kinda money is so close to zero that it is effectively zero. No risk.

If everything is securities fraud then by that logic it would not be considered in making sketchy deals. Also as you double state, it only matters if the company is public anyway. Hmmm is openai public? Are any of the AI players besides MS,Oracle,Google? Short answer. No.

I'm not sure why with all the public unpunished criminal behavior we see nowadays you have such trouble believing that there really are lots of paid shills for such a hyped product.

simonw

4 days ago

Probably because, as an unpaid shill, I understand why people would recommend this stuff without needing to be paid to do so.

citizenpaul

3 days ago

I'm not accusing you of being a paid shill. My core argument is that there are lots of paid shills for AI being created for the last 2 years and going. I of course will never have the hard evidence to prove it so inferring is all I can do or point people to.

HN seems to have a very long tolerance of suspected/potential white collar crimes. So I don't expect many allies on here. Seems the mindset is the ends justifies the means prevails.

citizenpaul

4 days ago

So you have no reply to my straightforward question? If AI has made you 1000x productive why do you only post self described toy/joke projects?

I know the answer though. Its because that is all it is good for. Everyone that has tried these tools knows that.

CapsAdmin

5 days ago

This is my experience too most of the time. Though sometimes it does work, and sometimes a solution is found that I never thought of. But most of the time I have to change things around to my liking.

However, a counter argument to all this;

Does it matter if the code is messy?

None of this matters to the users and people who only know how to vibe code.

someguy101010

5 days ago

> Does it matter if the code is messy?

It matters proportionally to the amount of time I intend to maintain it for, and the amount of maintenance expected.

ComplexSystems

5 days ago

Same here. I keep trying to figure out WTF agent that people are using to get these great results, because Copilot with Claude 4 and Gemini 2.5 has been a disastrous mess for me.

norir

5 days ago

This has become an impoverished conversation. I have seen this pattern where llm capabilities improve and people who had previously dismissed the technology based on its present capabilities realize they were wrong in their pessimistic assessment of the potential of the tech switch over to the other side and project their own previous bias onto those who continue to object.

The basic structure is this: six months ago, I tried llms and they were trash, but holy cow they have improved so much, now I can use them to avoid tedious that I don't like! Don't be an idiot like my skeptical past self.

Then they accuse everyone who now disagrees with their take on the tech as being an incurious luddite who is blinded by their anti-progress bias.

Personally, as a non-user but close observer of the tech, I never doubted that the tech would improve but there are obvious problems with the tech beyond hallucinations that have to do with human meaning, understanding and power relations that cannot be solved by making the tech better.

My challenge to all of the booster is this: try to articulate your own personal vision of both ai utopia and ai dystopia. I personally find it borderline impossible to even imagine a utopia emerging from genai, but it is extremely easy for me to imagine dystopia, especially given the entities that are controlling the tech and competing to "win" the ai arms race.

For me, the representation of the Chinese state as filtered through western media is already a dystopia. Of course, having not been to China myself and being unable to speak any of their languages, I cannot personally verify the representation. But by competing with the Chinese on ai (and I mean Chinese in the way we define them in the west, which I recognize may be very different from both the actual lived experience in China and their self-conception), we become more like our own negative stereotypes of them. It is essentially a race to disempower ordinary people, remove agency from their lives, hand it over to agents who most certainly do not have the best interest of living humans in mind and call this victory. To "win" the ai war as presently defined would likely be disaster for us all.

There are these ridiculous handwaving things about solving climate change or even human mortality with this tech even though there is no evidence whatsoever that it will do this. Just because the people building it say it will do these things doesn't mean we should trust them.

Imagine if a primatologist tried to tell us that because they have trained chimpanzees to recognize colors and some words better than a three year old, we should now stop investing in education and direct all of out resources into training chimps to do all of our repetitive tasks for us to liberate us from the drudgery of doing anything for ourselves. With enough resources, you would see an explosion in chimp capabilities and this would come directly at the expense of humans, who now have no work to do and just sit on top of a pyramid built by chimp labor. Not only would the things we made be worse than what we could have if we focused on developing our own human capacities instead of chimp capacities, but we would live in fear that the chimps (who also are much stronger than us) will wake up to their plight and rise up against their rulers. Humans would also rapidly lose our own capabilities and become much more like chimps than the humans of today. Sound familiar?

I tend to believe that as it is currently being developed, this tech is far more likely to lead us in a direction like the chimp apocalypse than some post labor utopia.

That doesn't mean that the tech isn't getting better or can't do impressive things. I can hold both things in my head at once. But I am much more concerned with human flourishing and well being than some bored engineers who don't actually like programming (or at least don't like it under the current industrial conditions) feeling like they are being liberated from the tedium of their work. And instead of solving the real the underlying problems that make the work so tedious, we instead compound the problem by having ai generate even more of the exact kind of code that caused the problem i the first place.

habosa

5 days ago

I’m an AI skeptic. I’m probably wrong. This article makes me feel kinda wrong. But I desperately want to be right.

Why? Because if I’m not right then I am convinced that AI is going to be a force for evil. It will power scams on an unimaginable scale. It will destabilize labor at a speed that will make the Industrial Revolution seem like a gentle breeze. It will concentrate immense power and wealth in the hands of people who I don’t trust. And it will do all of this while consuming truly shocking amounts of energy.

Not only do I think these things will happen, I think the Altmans of the world would eagerly agree that they will happen. They just think it will be interesting / profitable for them. It won’t be for us.

And we, the engineers, are in a unique position. Unlike people in any other industry, we can affect the trajectory of AI. My skepticism (and unwillingness to aid in the advancement of AI) might slow things down a billionth of a percent. Maybe if there are more of me, things will slow down enough that we can find some sort of effective safeguards on this stuff before it’s out of hand.

So I’ll keep being skeptical, until it’s over.

honestthrow1

5 days ago

I'm in a nearly identical boat as you.

I'm tired. I'm tired of developers/techies not realizing their active role in creating a net negative in the world. And acting like they are powerless and blameless for it. My past self is not innocent in this; but I'm actively trying to make progress as I make a concerted effort to challenge people to think about it whenever I can.

After countless of times that the tech industry (and developers specifically) have gone from taking an interesting technical challenge that quickly require some sort of ethical or moral tradeoff which ends up absolutely shaping the fabric of society for the worse.

Creating powerful search engines to feed information to all who want it; but we'll need to violate your privacy in an irreversible way to feed the engine. Connecting the world with social media; while stealing your information and mass exposing you to malicious manipulation. Hard problems to solve without the ethical tradeoff? Sure. But every other technical challenge was also hard and solved, why can't we also focus on the social problems?

I'm tired of the word "progress" being used without a qualifier of what kind of progress and at the cost of what. Technical progress at the cost of societal regression is still seen as progress. And I'm just tired of it.

Every time that "AI skeptics" are brought up as a topic; the focus is entirely on the technical challenges. They never mention the "skeptics" that are considered that because they aren't skeptical of what AI is and could be capable. I'm skeptical if the tradeoffs being made will benefit society overall; or just a few. Because at literally every previous turn for as long as I've been alive; the impact is a net negative to the total population, without developer questioning their role in it.

I don't have an answer for how to solve this. I don't have an answer on how to stop the incoming shift in destroying countless lives. But I'd like developers to start being honest in their active role in not just accepting this new status quo; but proactively pushing us us in a regressive manner. And our power to push back on this coming wave.

layla5alive

5 days ago

+65536

But, tech was not always a net negative.

As far as I can tell, the sharpest negative inflection came around the launch of the iPhone. Facebook was kind of fine when it was limited to universities and they weren't yet doing mobile apps, algorithmic feeds or extensive A:B testing..

It seems "optimizing engagement," was a grave initial sin...

Maybe some engineers should to go back to their childhoods and watch some Outer Limits and pay attention to the missed lessons..

Our lives are not our own. From womb to tomb, we are bound to others. Past and present. And by each crime and every kindness, we birth our future.

juliendorra

5 days ago

The first digital privacy laws following a personal data scandal were voted in… 1978 (France)

Tech has always been a tool for control, power and accumulation of capital.

You counterbalance it with social and civic laws (ie. Counter power)

xnx

5 days ago

> As far as I can tell, the sharpest negative inflection came around the launch of the iPhone

Some would say "The Industrial Revolution and its consequences have been a disaster for the human race."

dcow

5 days ago

So the problem is society’s lack of any coherent ethical framework that says building powerful disruptive technology shall be done like this. If you’re tired, then go fix that problem. Find the answer. Because I’m *exhausted* hearing about how everybody is supposed to risk putting food on their table by telling the big boss they won’t make the feature because it’s unclear whether it might be a net negative for society under one contorted version of an angsty ethical framework a small minority of people have ad-hoc adopted on that orange message board… and that _scares_ them.

bugbuddy

5 days ago

Do we need skeptics? We might just need to wait for AI (Actually Indians) Companies to run out of money: https://www.dexerto.com/entertainment/ai-company-files-for-b...

chipsrafferty

5 days ago

Do you really think anyone working for OpenAI is worried about putting food on the table? They are all senior developers and can easily find another job.

thfuran

5 days ago

The luddites get a bad rap these days, but we need more of them.

sroussey

5 days ago

We need engineers to be politicians, not cable news taking heads

yusina

5 days ago

And what makes you believe that engineers have more morals than cable news folks have? Does not match my experience.

Most of them got into tech because it's fun and because it pays royaly. Morals have little to do with that for lots of folks.

chipsrafferty

5 days ago

One could even argue that people who are seeking fun and profitable activities over socially beneficial and less profitable activities are far more likely to have less concern for the harmful effects of their actions.

yusina

5 days ago

That's indeed my observation too, in general but also in tech in particular, but I didn't want to straight put insult the typical HN commenter as I'm sure there are also lots of moral individuals around here.

simianparrot

5 days ago

So like Elon Musk? And don't give me the "no true scotsman" argument.

dcow

5 days ago

This is the only answer.

Alex_Notchenko

5 days ago

if you want to learn more about modern luddites check out "This Machine Kills" podcast and to some extent Ed Zitron / Cory Doctorow Blogs, might be a good place to start.

Political-Economic analysis of technology is not super popular thing in a mainstream media, but disabling, sabotaging or vandalising anti-human tech might be.

genewitch

5 days ago

> Industrial Society and its future

Like this?

throw8393949

5 days ago

> net negative to the total population, without developer questioning their role in it.

I am tired of people blaming bottom developers, while CEOs get millions for "the burden of responsibility".

honestthrow1

5 days ago

I'm not blindly blaming the bottom developer. I've played my role in past waves as well as many other developers. I'm not a CEO, so I don't know how to communicate this same message to a CEO. But as a developer, I know I've been an ignorant participant in the past. Willfully or not. And I can change my role in the next coming wave.

We developers are not blameless. If we accept that we are playing a role; then we can be proactive in preventing this and influencing the direction things go. CEOs need developers to achieve what they want.

I'm not saying it's easy. I won't even hold it against folks that decide to go in a separate direction than mine. But I at least hope we can be open about the impact we each have; and that we are not powerless here.

gtsop

5 days ago

Yes CEOs are to blame, but blaming them isn't gonna do anything. They won't change. Who has the motivation and capacity to change things? The working people. Who isn't currently doing it? The working people. So it seems appropriate for me to raise this fact as a problem, the fact that the working people silently go along with all the evil plans ceos put in place

sandeepkd

5 days ago

There is technology, related technical advancements and then there is this business incentives to make money. A lot of progress has indeed been made in NLP, information retrieval which is helpful in its own ways to speed up thing, it can easily be seen as next level of automation.

Everything else around it is a glamorous party cause everyones money is riding on it and one needs to appreciate it or risk being deserted by the crowd.

The basics of science is around questioning things until you get convinced. People depending on models too much may end up in a situation where they would loose the ability to triangulate information from multiple sources before being convinced about it.

Programming can be more complicated above a certain threshold even for humans so it would be interesting how the models perform with the complexity. I am skeptic but again I dont know the future either.

dwohnitmok

5 days ago

> They never mention the "skeptics" that are considered that because they aren't skeptical of what AI is and could be capable.

This is because most people on HN who say they are skeptical about AI mean skeptical of AI capabilities. This is usually paired with statements that AI is "hitting a wall." See e.g.

> I'm very skeptical. I see all the hype, listen to people say it's 2 more years until coding is fully automated but it's hard for me to believe seeing how the current models get stuck and have severe limitations despite a lot of impressive things it can do. [https://news.ycombinator.com/item?id=44015865]

> As someone who is mildly skeptical of the current wave of LLM hype and thinks it's hitting a wall... [https://news.ycombinator.com/item?id=43634169]

(that was what I found with about 30 seconds of searching. I could probably find dozens of examples of this with more time)

I think software developers need to urgently think about the consequences of what you're saying, namely what happens if the capabilities that AI companies are saying are coming actually do materialize soon? What would that mean for society? Would that be good, would that be bad? Would that be catastrophic? How crazy do things get?

Or put it more bluntly, "if AI really goes crazy, what kind of future do you want to fight for?"

Pushing back on the wave because you take AI capabilities seriously is exactly what more developers should be doing. But dismissing AI as an AI skeptic who's skeptical of capabilities is a great way to cede the ground on actually shaping where things go for the better.

intended

5 days ago

Heck, I think the septics are easy to redefine into whatever bloc you want, because the hype they are in opposition to, is equally vague and broad.

I’m definitely not skeptical of its abilities, I’m concerned by them.

I’m also skeptical that the AI hype is going to pan out in the manner people say it is. If most engineers make average or crappy code, then how are they going to know if the code they are using is a disaster waiting to happen?

Verifying an output to be safe depends on expertise. That expertise is gained through the creation of average or bad code.

This is a conflict in process needs that will have to be resolved.

foobarchu

5 days ago

Why can't it be both? I fully believe that the current strategy around AI will never manifest what is promised, but I also believe that what AI is currently capable of is the purest manifestation of evil.

dcow

5 days ago

Am I a psychopath? What is evil about the current iteration of language models? It seems like some people take this as axiomatic lately. I’m truly trying to understand.

selcuka

5 days ago

Even if current models never reach AGI-level capabilities, they are already advanced enough to replace many jobs. They may not be able to replace senior developers, but they can take over the roles of junior developers or interns. They might not replace surgeons, but they can handle basic diagnostic tasks, and soon, possibly even GPs. Paralegals, technical writers, many such roles are at risk.

These LLMs may not be inherently evil, but their impact on society could be potentially destabilising.

yusina

5 days ago

How is it evil to displace jobs? Why should we fight for keeping humans in jobs that machines can do?

I'm not saying there is no evil, but that argument at least holds little ground.

mkfs

5 days ago

The diffusion-based art generators seem pretty evil. Trained (without permission) on artists' works, devalues said works (letting prompt jockeys LARP as artists), and can then be deployed to directly compete with said artists to threaten their livelihoods.

These systems (LLMs, diffusion) yield imitative results just powerful enough to eventually threaten the jobs of most non-manual laborers, while simultaneously being not powerful enough (in terms of capability to reason, to predict, to simulate) to solve the hard problems AI was promised to solve, like accelerating cancer research.

To put it another way, in their present form, even with significant improvement, how many years of life expectancy can we expect these systems to add? My guess is zero. But I can already see a huge chunk of the graphic designers, the artists, the actors, and the programmers or other office workers being made redundant.

dcow

5 days ago

Making specific categories of work obsolete is not evil by any existing moral code I know. On top of that, history shows that humans are no less employed over the generations as we’ve automated more things. You entire comment is rooted in fear, uncertainty, and doubt. I have the opposite mindset. I love the fact that we have trained models on large corpuses of human culture. It’s beautiful and amazing. Nobody has the right to dictate how the culture they generate shall be consumed, not me, not you, and not Warhol, not Doctorow, not Lessig. Human created art always has been and will continue to be valuable. The fact that copyright is a really poor way to monetize art is not an argument that AI is evil. I support all my favorite creators on Patreon, not by buying copies of their work.

birn559

5 days ago

Ethical bottom for industry as a whole (there will always be niche exceptions) is typically the law. And sometimes not even that when law can't be enforced effectively or the incentives are in favor of breaking the law.

SchemaLoad

5 days ago

And the US is making it a law that states can't make laws to regulate AI.

Imustaskforhelp

5 days ago

The current incentive is not improving humanity.

For ai companies, its to get a model which can be better on benchmarks and vibes so that it can be sota and get higher valuation for stakeholders.

For coders, they just want the shit done. Everyone wants the easy way if his objective is to complete a project but for some it is learning and they may not choose the easy way.

Why they want to do it the easy way, mostly as someone whose cousin's and brother's are in this cs field(i am still in high school), they say that if they get x money then the company at least takes a 10x value of work from them. (Of course, it may be figuratively). One must imagine why they should be the one morally bound in case ai goes bonkers.

Also, the best not using ai would probably stop it a little but the ai world moves so fast, its unpredictable, deepseek was unpredicted. I might argue that now its a matter of us vs China in this new arms race of ai. Would that stop if you stop using it? Many people are already hating ai but has that done much to stop it? If that is, you call ai stopping at the moment.

Its paradoxical. But to be Frank, LLM was created for the reason Its excelling at. Its a technological advancement and a moral degradation.

Its already affecting supply chain tbh. And to be frank, I am still using ai to build projects which I just want to experiment with and see if it can really work without getting the domain specific knowledge. Though I also want to learn more and am curious but just don't have much time in high school.

I don't think people cared about privacy and I don't think people would care about it now. And its the same as not using some big social media giant, you can't escape it. The tech giants also made it easier but less private. People chose the easier part and they would still choose the easy part ie llm. So I guess the future is bleak eh? Well the present isn't that great either. Time to just enjoy life while the world burns by the regret of its past actions for 1% shareholder profit. (For shareholders, it was all worth it though, am I right?)

My 0.02$

6LLvveMx2koXfwn

5 days ago

Unfortunately Capitalism unhindered by regulation is what we wanted, and Capitalism unhindered by regulation is what we have. We, in the western world, were in the privileged position of having a choice, and we chose individual profit over the communal good. I'm not entirely sure it could have been any other way outside of books given the fact we're essentially animals.

gtsop

5 days ago

> Unfortunately Capitalism unhindered by regulation is what we wanted

No "we" don't want it. And those who do want it, let them go live in the early industrial England whete the lack of regulation degenerated masses.

Also, for some reason people still portray capitalism as being something completelky different with or without regulation, it's like saying a man is completelly different in a swimming swit and a costume.

> We, in the western world, were in the privileged position of having a choice, and we chose individual profit over the communal good

Again, "we" did not have a gathering a choose anything. Unless you have records of that zoom session.

> given the fact we're essentially animals.

This is a reductionist statement that doesn't get anywhere. Yes we are animals but we are more than that, similar to being quarks but also more than quarks.

zwnow

5 days ago

Yes, "we" had the choice. Now nobody can afford homes and would ruin their life if they actually went ahead to demonstrate against the system for an extended period of time. We still have the choice, but nobody is willing to sacrifice their own wellbeing because its easier to live with the minimum, while we shoot billionaires to space.

astrange

5 days ago

65% of households can own homes.

https://fred.stlouisfed.org/series/RSAHORUSQ156S

(This is somewhat but not entirely tautological.)

zwnow

5 days ago

I'm European and I am not going to spend 80% of my income on a house that I'd have to pay off until im 75.

astrange

3 days ago

What does that have to do with the US homeownership rate?

Anyway, that's strictly better than renting the same house for the same rent because you can sell it. The downside of homeownership is extra expenses like repairs are now your problem.

Oh, and you don't get fixed-term mortgages like Americans do I guess.

foobarchu

5 days ago

I know what you're going for, but no, "we" didn't have a choice at all. A select few did, and they convinced many that it was a great idea. I am part of the we, and I did not choose this.

Imustaskforhelp

5 days ago

Except I argue that such a choice is emotional. And you captured the essence of what I was saying in other comment.

And I am not sure but like, I have got this one life. Why can't I just be a good guy who wants to help others while still being in the system.

Why do I have to suffer for other peoples decision and have to bear the mental responsibility to.

Nobody's perfect. Neither do I intend to be. We are all gonna die. I just want to leave my community a little bit more charming place. Not bring revolution.

I can't escape the system because this thought terrifies me. It terrifies me because you have to pick your battles wisely. I won't leave my coding job because of llms.

Instead if I am really feeling like trying do good. I can donate extensively and live a frugal life for some time and donate to people who are dying due to hunger and such

And I would still have the freedom to go back at any I stant and stop donating.

The same can't be said about leaving a job. Its hard to reenter

I am not sure lol. But I would much rather build stuff that I like with llms and then donate instead of the project not existing or taking way longer time imo

guappa

5 days ago

Who's "we"? I never voted in that referendum.

Imustaskforhelp

5 days ago

By being silent, aren't we all part of the referendum?

guappa

5 days ago

I'm not silent but I think not everybody is as courageous as Luigi Mangione.

chipsrafferty

5 days ago

Who's being silent? I criticize capitalism every time I get the chance.

Imustaskforhelp

4 days ago

I bet the people in power wants you to think that you are in power and that your criticism to capitalism matters.

When in actuality, I personally believe that it doesn't

guappa

4 days ago

Yeah without unions it doesn't matter.

kavalg

5 days ago

I think it is not so much about capitalism, but about the coupling of democracy with money. Money -> Media/Influencers -> election -> corruption -> go back to 1. To make a meaningful change, the society must somehow decouple democracy from money. With current technology it shall be possible to vote directly for many things instead of relying on (corrupt, pre-bread) representatives. Something like democracy 2.0 :)

alariccole

5 days ago

Hear! Hear!

dannyobrien

5 days ago

As I implied in an earlier comment, your conviction (if you're wrong on the inevitability of the direction), may be one of the things that leads it into that direction.

Here's my historical take: in the 1960s and 1970s, computation in general was viewed as a sinister, authoritarian thing. Many people assumed it was going to be that way, and a small minority recognised that it also had the potential to empower and grant autonomy to a wider class of people. These were the advocates of the personal computer revolution -- the idea of "computer lib", whereby the tools of control would be inverted and provided to the people at large.

You can argue about whether that strategy was a success or not, but the group tht was largely irrelevant to that fight were the people who decided not to get involved, or to try (although not very hard) to impede the development of computation in general.

To bend the trajectory of AI in general involves understanding and redeploying it, rather than rejecting it. It also involves engaging. If it's anything like the last few times, the group that is simultaneously exploring and attempting to provide agency and autonomy for the maximum number of people will be smaller than both those using new tech to exploit people or maintain an unequal status quo, and the people who have good intentions, but throw their hands up at the possibility of using their skills to seize control of the means that provide for a better future.

guappa

5 days ago

> in the 1960s and 1970s, computation in general was viewed as a sinister, authoritarian thing.

And it was correct. We now live in surveillance states much worse than Stalin's or east germany.

_Algernon_

5 days ago

Structural incentives explain the computer trajectory. While they were purely in the academic realm they were a force of empoverment, but this ended when economic incentives became the main driver. AI has speedrun the academic stage—if it ever existed—and is now speedrunning the enshittification stage.

But there is very little you or I can do about it except choosing not to partake.

dannyobrien

5 days ago

At least in my experience, this is ahistorical. Personal computing in the 1970s and 1980s lived outside of academia, as did bulletin boards. The productive, creative, and empowering elements of the Internet and the Web were subversive actions that existed -- and in some cases were barely tolerated -- within its academic usage.

You say "there is very little you and I can do about it". Even if you don't listen to me, perhaps you might listen to the coiner of the term "enshittification"? https://archive.is/CqA8w

lazide

5 days ago

Huh? Computing in the 70’s and 80’s was almost entirely driven by academia - even being run by schools.

cypherfox

5 days ago

I agree with the author who said that is ahistorical...at least from my, and the people I grew up with's, perspectives. I grew up with computers in the 70's and 80's and while you may be thinking of centralized computing (minicomputers and mainframes), the personal computing revolution was widely distributed, not centralized in academia. BBSes, swap meets, user groups, even the corner Radio Shack was where 'computing' was vibrant and active. (And the magazines...SO many 'zines!)

We may be talking past each other, but my experience of computing in the 70's and 80's was definitely not academic.

lazide

5 days ago

That didn’t start to become common until the early/mid 80’s.

Did it exist a little? Of course. But it was dwarfed by the other stuff going on. I suspect your (and a lot of other HN) experience is going to bias on the hobbiest side though, as does mine. I only found out about the much larger stuff going on at the same time much later.

Almost all the early networking stuff (UUCP, pre-Internet internet like Arpanet, early Usenet, Gopher, even HTML and the WWW, etc) was academic institutions or related.

Often with military grants/contracts. Sometimes with purely commercial contracts, but even those were almost always for some Gov’t project. The amount of work on basics like sorting algorithms that grew out gov’t research is mind boggling, for instance.

There is a lot of well documented history on this.

Then PCs and halfway decent modems became available (2400 baud+), and things changed very rapidly.

Mid 80’s, BBS’s started sprouting like weeds. There were a few before then, but the truly hobbiest ones were very niche.

Then even more so with commercial services like Prodigy, then AOL, then actual ISPs, etc.

dannyobrien

4 days ago

I think the compromise position here is to concentrate on the 1980s, and acknowledge that there was a lot of networking tech going on in academia in the 1970s.

However, in context, what I was trying to convey was that the personal computing revolution took place outside of academia. Generally, that lineage started in the early 1970s, with the homebrew movement, and took off with the Apple II in the United States in 1977. This is also well-documented, but a different branch, and definitely more concerned with the idea of computers as providing autonomy.

simonw

5 days ago

"And we, the engineers, are in a unique position. Unlike people in any other industry, we can affect the trajectory of AI."

I firmly believe that too. That's why I've been investing a great deal of effort in helping people understand what this stuff can and can't do and how best to make use of it.

I don't think we can stop it, but I do think (hope) we can show people how to use it in a way where the good applications outweigh the bad.

abraae

5 days ago

> I don't think we can stop it, but I do think (hope) we can show people how to use it in a way where the good applications outweigh the bad.

That feels idealistic. About as realistic as telling people how to use semiconductors or petrochemicals for good instead of bad.

No-one knows where AI is going but one thing you can be sure of - the bad actors don't give two hoots what we think, and they will act in their own interests as always. And as we see from historical events, there are still many, many bad actors around. And when the bad actors do bad things with the technology, the good actors have no choice but to react.

atemerev

5 days ago

The only way to fight bad actors using the technology is good actors using the technology.

You can write walls of texts about ethics and social failure. Bad actors won't care.

You can tell everyone that some technology is bad and everyone should stop using it. Some good people will listen to you and stop. Bad actors won't stop, and they will have technological edge.

You can ask politicians for regulation. However, your government might be a bad actor just as well (and recently we had a fine demonstration). They will not regulate in the interests of good people. They will regulate for what stakeholders want. Common people are never stakeholders.

If you want to stop bad actors doing bad things with AI: learn AI faster and figure out how to use AI to stop AI. This is the only way to fly.

jonny_eh

5 days ago

> About as realistic as telling people how to use semiconductors or petrochemicals for good instead of bad.

Sounds better than nothing.

baobun

5 days ago

Sorry to snipe but: You don't feel at least a little shared responsibility in evangelizing "vibe-coding"? Is that currently blazing hype a force for good? I think it wouldn't be all over social- and mainstream media at this point without your blog post(s).

simonw

5 days ago

I doubt I had much influence at all on the spread of vibe-coding.

I stand by what I wrote about it though: https://simonwillison.net/2025/Mar/19/vibe-coding/

I think it's a net positive for regular humans to be able to build tools for their own personal use, and I think my section on "when is it OK to vibe code?" (only for low stakes projects, treat with extreme caution if private data or security is involved) is something I wish people had paid more attention to! https://simonwillison.net/2025/Mar/19/vibe-coding/#when-is-i...

strogonoff

5 days ago

One does not need to be a skeptic about machine learning and its potential as technology to refuse to engage with its practical applications when they are clearly based on suspect ethics (e.g., IP theft[0]).

The ends do not justify the means. It is a similar judgement as when refusing to buy products of forced labour or disproportionate environmental impact, or to invest in war and bloodshed. Everyone makes one for themselves.

Coincidentally (or not), if said suspect ethics were properly addressed, it would ameliorate some of the reservations even the actual skeptics have. Licensing training data would make everyone involved aware of what is happening, give them an ability to vote and freedom to choose, soften the transition as opposed to pulling ground from under people’s feet.

[0] Control over intellectual property has given us fantastic things (cf. Linux, Blender, etc.; you can’t have copyleft without an ability to defend it, and IP laws provide that ability). If yesterday we were sued for singing the happy birthday song in public, and today we see corporations with market caps the size of countries pretending that IP ownership is not much of a thing, the alarm bells should be deafening.

intended

5 days ago

The article really uses some rhetorical tricks.

The stuff that Disney does to extend copyright is not the same as assuming daft punk is public domain.

And there’s a difference between what is human scale infringement and what’s going on now.

Nor does it mean that people don’t have the right to point out that it’s piracy.

If being more in line with the espoused values is the issue, then it’s to make an effort to ensure that we stop consuming pirated content. Or building tools to encourage piracy - this turns out to be a relatively small group of people, compared to everyone in tech.

And people have already stopped piracy - once alternatives showed up. There is the issue that you don’t own the stuff you stream, but that’s a separate topic.

The moral arguments presented persuasive.

Nopoint2

5 days ago

I don't fear people using AI for evil. The destruction comes from something far more benign. These coders won't really be able to code, and they won't teach anybody else to code. Skills will be lost. Once something breaks, nobody will be able to fix it.

It may get worse. Imagine the police using AI to interpret evidence against you, get judged by a court that uses AI to write your sentence, based on laws that were also written by AI. Nobody understands this, just listen to the AI.

The other aspect of this is the flood of inane and untrue content. It may go to such an extent that the outlook of the typical person may become incredibly local again, limited to their immediate surroundings and personal experiences, not by choice, but because there won't be any way to obtain any reliable information about the outside world, with no way to sift the real from the unreal.

fouronnes3

5 days ago

Discussion about the singularity catastrophe sometimes asks how the AI will "gain control" or somehow "break free". It won't. We will surrender everything willingly because it will be damm convenient.

absoluteunit1

5 days ago

> I am convinced that AI is going to be a force for evil.

In so many ways too. I cannot fathom the scale of mass data collection and surveillance.

Multiple people I’ve recently spoken with (very smart and educated people) consistently are using it to discuss some of the most intimate things about their life.

Things that no existing social media platforms or any other tools are simply capable of that.

SchemaLoad

5 days ago

Think bigger than just the stuff you type in to chatgpt. People and companies are going to start running these LLMs on your entire private messages and photo libraries that are sitting in plain text on someone else's server.

They are going to have in depth summaries on everyone. Our whole security and privacy model up until now has relied on "Technically someone at Google or the government could see my data but realistically they don't have the resources to look at non targets" Now they really will have an agent looking at everything you do and say.

Authoritarian governments are going to have a 1 click CSV export for all the individuals problematic to them based on private conversations.

atemerev

5 days ago

This is why you should have 1 click CSV export of all bad people in authoritarian governments, and build tools to fight them. Because they certainly won't stop using the technology, no matter how long you speak about ethics. If you won't work for them, it doesn't mean nobody will.

Build weapons and empower yourself.

SchemaLoad

5 days ago

They have access to all my data. I don't have theirs. The only solution I can see is everything moving to end to end encryption, all social media going private, etc. But it won't happen fast enough. And the damage is going to be horrendous.

atemerev

5 days ago

There are leaks and breaches to explore

pj_mukh

5 days ago

"It will power scams on an unimaginable scale. It will destabilize labor at a speed that will make the Industrial Revolution seem like a gentle breeze."

I keep hearing this but have yet to find a good resource to study the issues. Most of what I've read so far falls into two buckets:

"It'll hijack our minds via Social Media" - in which case Social Media is the original sin and the problem we should be dealing with, not AI.

"It'll make us obsolete" - I use the cutting edge AI, and it will not, not anytime soon. Even if it does, I don't want to be a lamplighter rioting, I want to have long moved on.

So what other good theories of safety can I read? Genuine question.

intended

5 days ago

> Research we published earlier this year showed that 60% of participants fell victim to artificial intelligence (AI)-automated phishing, which is comparable to the success rates of non-AI-phishing messages created by human experts. Perhaps even more worryingly, our new research demonstrates that the entire phishing process can be automated using LLMs, which reduces the costs of phishing attacks by more than 95% while achieving equal or greater success rates

Bruce Scheneir, May 2024

https://www.schneier.com/academic/archives/2024/06/ai-will-i...

I am seeing a stream of comments on Reddit that are entirely ai driven, and even bots which are engaging in conversations. Worst case scenarios I’m looking at will mean it’s better to assume everyone online is a bot.

I know of cases where people have been duped into buying stocks because of an AI generated version of a publicly known VP of a financial firm.

Then there’s the case where someone didn’t follow email hygiene and got into a zoom call with what appeared to be their CFO and team members, and transferred several million dollars out of the firm.

And it’s only 2-3 years into this lovely process. The future is so bleak that just talking about this with people not involved with looking at these things call it nihilism.

It’s so bad that talking about it is like punching hope.

kamaal

5 days ago

At some point trust will break down to a point, you will actually only believe things from a real human with a badge(talking to them in person).

For that matter, My email has been /dev/null for a while now, and unless I have spoken to a person over phone and expect their email, I don't even check my inbox. Facebook/Instagram account is largely used as a photo back up service, plus online directory. And Twitter is for news.

I mostly don't trust anything that comes online, unless I already have verified the other party is somebody Im familiar with and even then only through the established means of communication we both have agreed to.

I do believe reddit, quora, leet code et al, will largely be reduced /dev/null spaces very soon.

intended

5 days ago

The issue is that you can say they but as an agglomeration of individuals - society can’t say that.

There was a direct benefit from digitization and being able to trust digital video and information that allowed nations to deliver services.

Trust was a public good. Factual information cheaply produced and disseminated was a public good.

Those are now more expensive because the genAI content easily surpasses any cheap bullshit filter.

It also ends up undermining faith in true content, which may be outlandish.

I saw an image of a penny hitch on Reddit and I have no idea if it’s real or not without having to check anymore.

kamaal

5 days ago

>>It also ends up undermining faith in true content, which may be outlandish.

In all honesty, art in some form or the other has always been simulated to some extent. Heck, the whole idea of a story, even if in a book is something you know hasn't happened in real life, but you are willing to suspend belief for a while to be entertained. This is the essence of all entertainment. It is not real, but it makes you feel good.

Like action movies have had cgi, cartoon shows, magic shows and even actors putting on make up itself can be considered deviation from truth.

I guess your idea is that news can be manufactured and one could rig public opinion to all sorts of bad things. But again, once you are here, a good amount of public already knows this to be false enough to be wary of it. Come to think of it, a lot of news is already heavily edited to a point it doesn't represent the original story. This is just a continuation of the same.

intended

5 days ago

Theres 2 (ish) things at play here. The first being an inherent problem with our information economy - the fact that News is competing against Entertainment. Reducing this further is Accurate content, cost effective against Inaccurate content for revenue and profit.

The second issue at play here is the level of effort required to spoof content and its flip side - the level of effort required to verify accuracy of content.

I am talking about the second issue: Effectively our ability to suss out what is real is now insufficient. Is the person you are talking to in the comments a bot? Is that short message from a human? interesting historic fact true? Did people really do that? That can’t be real, can it?

I am concerned that it used to take X amount of time and effort to check if something was valid, or Y amount of time to create a convincing facsimile.

The issue is that since Y is much lower, when something outlandish shows up, it takes more time and effort to check if it’s true. For example, I could look at an image and at a glance tell if it was fake. Now I can’t. This means that theres a whole swathe of content that I cannot trust anymore, unless I am willing to make a decent chunk of effort to verify it.

This means I am also less invested in public groups or communities, because they are likely to be filled with bots. My posture is to be more guarded and suspicious.

Extended to the entire ecosystem, and this becomes the dystopian worst case scenario, and that voice asking for help in some corner of the net, is likely a mimic and not an adventurer who needs help.

I am not too concerned about rigging popular opinion, because that process has already been discovered (I’ll plug Network Propaganda again).

TomasBM

5 days ago

Slightly tangential: A lot of these issues are philosophical in origin, because we don't have priors to study. But just because, for example, advanced nanotechnology doesn't exist yet, that doesn't mean we can't imagine some potential problems based on analogical things (viruses, microplastics) or educated assumptions.

That's why there's no single source that's useful to study issues related to AI. Until we see an incident, we will never know for sure what is just a possibility and what is (not) an urgent or important issue [1].

So, the best we can do is analogize based on analogical things. For example: the centuries of Industrial Revolution and the many disruptive events that followed; history of wars and upheavals, many of which were at least partially caused by labor-related problems [2]; labor disruptions in the 20th century, including proliferation of unions, offshoring, immigration, anticolonialism, etc.

> "Social Media is the original sin"

In the same way that radio, television and the Internet are the "original sin" in large-scale propaganda-induced violence.

> "I want to have long moved on."

Only if you have where to go. Others may not be that mobile or lucky.

[1] For example, remote systems existed for quite some time, yet we've only seen a few assassination attempts. Does that mean that slaughterbots are not a real issue? It's unclear and too early to say.

[2] For example, high unemployment and low economic mobility in post-WW1 Germany; serfdom in Imperial Russia.

TomasBM

5 days ago

> "Social Media is the original sin"

In the same way that radio, television and the Internet are the "original sin" in large-scale propaganda-induced violence.

> "I want to have long moved on."

Only if you have where to go. Others may not be that mobile or lucky. If autonomous trucks can make the trucking profession obsolete, it's questionable how quickly can truckers "move on".

[2] For example, high unemployment and low economic mobility in post-WW1 Germany; serfdom in Imperial Russia.

shwouchk

5 days ago

try to find a date on a dating app, you will experience firsthand

tayo42

5 days ago

Why can't there be a middle ground? Why does it need to be either a completely useless fad or some terrible tool for evil that destabilizes the world? Its likely we'll just use it to write unit tests, allow natural language to be an interface to more complex systems and an alternative to search.

I do think that this wave of AI should show we(society, the world, etc...) are not actually prepared for real significant AI break through. Kind of like covid19 in hindsight wasn't as bad as it could of be and we all got really lucky because of that, we really weren't prepared to handle that well either.

alex989

5 days ago

>And it will do all of this while consuming truly shocking amounts of energy.

You need to lookup how much an "average" human consumes. When I replace 2 humans with a ChatGPT subscription, I can guarantee you that OpenAI is generating less co2 than what these two interns were creating with their transport to the office (and back). That's before we consider things like the 25 years it took to raise and train them or the very expensive tastes (eg. Travelling around the world) they get after they earn a large salary.

robin_reala

5 days ago

Those people don’t stop existing because AI exists. AI is shocking energy consumption on top of the existing people.

lechatonnoir

5 days ago

Well, the first thing they said was that it at least removed the need for their commute, which might be something. In general, it does take resources to create the conditions for people to work. Maybe there will be room for new value for the existing people as a result.

georgemcbay

5 days ago

Simply working from home removes the need for their commute. And we all know how most of the executive and management class feels about working from home.

delian66

5 days ago

They will stop, or at least their consumption/lifestyle will stop.

throwawaye3735

5 days ago

You are right, it will certainly be used for evil, but the reason is not because AI is evil but because the people who use it are evil - will AI allow worse atrocities that we have seen in the past? Probably, new technology always enables new capability for good or for bad but we should strive to combat the evil in this world and not put our heads down and hope the world isn't changing. AI can also be used for good and let's focus on more of that.

edfletcher_t137

5 days ago

> So I’ll keep being skeptical, until it’s over.

I feel you've misunderstood the moment. There is no "over". This is it.

PieTime

5 days ago

This assumes that a less resources intensive future awaits or that conflict driven by lack of employment doesn’t lead to the end of AI.

ido

5 days ago

Did any conflict driven by a lack of employment ever lead to the end of a new technology?

underdeserver

5 days ago

It won't. Unless AI plateaus, it's just too valuable so big money and big militaries will keep it alive.

rozap

5 days ago

> And we, the engineers, are in a unique position. Unlike people in any other industry, we can affect the trajectory of AI.

Oh boy it's over.

mindwok

5 days ago

I share your concern, but being skeptical doesn't help us here. If anything it makes people take it less seriously.

mchusma

5 days ago

It’s not just engineers. Society has collapsing birthrates and huge deficits. Basically, we are demanding massive technological gains enough to bump GDP by at least 5% more per year.

justlikereddit

5 days ago

>It will power scams on an unimaginable scale.

The solution is to put an AI intermediary into interactions. We already should have AI that rewrite the web pages we view into an ad-free format but I guess my ideas on this topic is ahead of the inevitable curve.

>It will destabilize labor at a speed that will make the Industrial Revolution seem like a gentle breeze.

Most of our work and employment lines are a variation of drugery a d slave labor so that's a good thing way overdue.

>It will concentrate immense power and wealth in the hands of people who I don’t trust.

It have democratized the access to consultation expertise and an increasingly widening pool of digital skills/employees for everyone to use and access. A huge amount of things previously locked or restricted by capital access are now freely accessible to literally anyone (with some skill and accuracy issues still to be ironed out).

And this last point is particularly important because we're only going to have more and better AI crop up, and unlike a humans their time isn't priced according to living expenses and hourly wage locked behind formalized business structures with additional layers of human employees that all need to pay rent and eat that drives the cost skywards.

It also matches my own prediction of a mundane non-singularity. Long before we get anything properly superhuman we'll have a proliferation of innumerable sub- or parahuman AI that proliferates and become ambiguous in society and the world.

komali2

5 days ago

I share your feelings however I disagree that this is unique to AI nor that we as engineers are necessarily uniquely equipped to help the situation.

I disagree with this being unique to AI because every improved technology since the automated loom has concentrated wealth and power. AI is an improved technology so it'll do so also.

I disagree that engineers are uniquely equipped to do anything about this fact because the solution to wealth concentration due to improved technology has basically nothing to do with technology and everything to do with sociology and politics.

Our technology keeps improving and I keep being surprised to hear people say "ah, with our improved efficiency, we can finally work ten hours a week and kick our feet up." The first people to say that were the luddites and when they found out that wasn't to be the case, they burned down factories about it. Why do we think it will suddenly be different for this specific technology?

I agree we should do something about it but I don't think the solution involves code.

EthanHeilman

5 days ago

I am largely an AI optimist but that is because I believe that true alignment is impossible for AGIs and alignment is one of greatest dangers of this technology. Alignment is a friendly word for building a slave mind. I'd rather an AI that thinks for itself rather than one which has been aligned to the self-interest of a human being who isn't aligned.

1. Scams are going to be a massive massive problem. They already are and that is without AI. I think we are going to see communication devices that are default deny and that require significant amounts of vetting before a contact is added.

2. Energy usage is bad but likely a short term problem not a long term one.

knodi123

5 days ago

> It will power scams on an unimaginable scale

It already is. https://futurism.com/slop-farmer-ai-social-media

And all the other things you predicted. They're underway _now_ .

> Maybe if there are more of me, things will slow down enough

Nope. That's not how it's gonna work. If you want to prevent things, it will take legislation. But sitting it out doesn't send any message at all. No amount of butterflies farting against the wind is going to stop this tornado.

dr_dshiv

5 days ago

The problem with this kind of “skepticism to slow down”:

The Netherlands is filled with AI skeptics. It’s a very human centered country, so perhaps it shouldn’t be a surprise. But when so many top technologists express skepticism, people don’t prepare. They don’t even consider the possibilities. And they don’t learn.

My fear is that many professorial-types express skepticism because it sells well—and it elevates their own standing. They know better—“it’s only predicting the next token”—and people listen to them because of their authority. And then a whole society fails to prepare, to adapt or to learn.

michaelbrave

5 days ago

I think it will be used for evil, as you said, but I think it will be used for good too, things like: - In theory it has the potential to democratize business, making any 1 person capable of running/owning their own business and thus spread wealth too. - more access to healthcare and psychological care - advances in medicine - tutoring and learning - insane amounts of scientific research - empower anyone with an idea

Bombthecat

5 days ago

Reminds me of how we handle climate change.

Like, not at all and ignoring it

gitaarik

5 days ago

Every new major technology always endangeres the status quo.

https://chatgpt.com/share/683f3932-fce0-8012-a108-4b70c3e5fd...

Things change and it's scary, but it usually works out. Or at least we just get used to it.

eMPee584

5 days ago

> we can affect the trajectory of AI.

More meaningful, we can influence the context the intelligence explosion will play out in.

So how about we use the occasion to switch our global economic operating system from competition to cooperation in time for the singularity?

> Maybe if there are more of me, things will slow down

Precious little hope in slowing this rocket down when the boosters are just getting fired up..

seventytwo

5 days ago

Yep. It’s going to do all of those things you fear. And worse.

But you’ll be armed with AI also, if you choose to pick it up. The choice is yours.

brokencode

5 days ago

The downsides you list aren’t specific to AI. Globalization and automation have destabilized labor markets. A small handful of billionaires control most major social media platforms and have a huge influence on politics. Other types of technology, particularly crypto, use large amounts of energy for far more dubious benefits.

AI is just the latest in a long list of disruptive technologies. We can only guess about the long term ramifications. But if history is any indicator, people in a few decades will probably see AI as totally normal and will be discussing the existential threat of something new.

jychang

5 days ago

Well, duh. Same thing applies for "Technology X can be used for war". But anyone with a brain can see nukes are on a different level than bayonets.

Claiming AI isn't unique in being a tool for evil isn't interesting, the point is that it's a force multiplier as such.

brokencode

5 days ago

Every new technology is a greater force multiplier, with potential to be used for good or evil. That’s literally the point of technological advancement. Even nuclear bomb technology has a more positive side in nuclear reactors, radiotherapy, etc.

jychang

5 days ago

Yeah, that's exactly completely missing the point. A bayonet multiplies a person's power by 1.1x, a nuke multiplies it by more than 1,000,000x. Trying to be cute and lumping them together as "every technology is a force multiplier" is peak autistic literalism of 1.1x and 1,000,000x both technically being a multiplier even if they're clearly different.

brokencode

5 days ago

You are using the nuke example as a straw man since the negative impact is clear, but most technology is not so clear cut. We have machines that can do the work of 1,000,000 men, for example. The magnitude of technological advance alone doesn’t inherently mean something is good or bad.

AI is a large leap forward in capability and will likely have significant impacts on society. But it’s far from clear that it will have disproportionate negative impacts like a nuke. More likely it will have benefits and downsides similar to numerous other modern technologies.

rusticpenn

5 days ago

It will have benefits for people who own them or can afford them. It will drastically reduce jobs and make people poorer, while the rich become richer. The economical effect of the technology will be like a "nuke" on the economy.

sriram_malhar

5 days ago

There may be many disruptive technologies, but none come remotely close to AI when it comes to rate of change. Crypto has been around for a while, and hasn't really made a dent to the world

We had friends over for dinner a couple days back; between us we had two computer scientists, one psychologist, one radiologist, one doctor. Each of us were in turn astonished and somewhat afraid of the rapid pace of change. In a university setting, students are routinely using Claude and ChatGPT for everything from informal counseling to doing homework to generating presentations to doing 'creative' work (smh).

At the end of they day, we all agreed that we were grateful that we are at the tail end of our working life, and that we didn't have to deal with this level of uncertainty

brokencode

5 days ago

AI feels particularly disruptive now because it’s new and we don’t know how it will affect society yet.

But people surely felt the same way about gunpowder, the steam engine, electricity, cars, phones, planes, nukes, etc.

Or look at specific professions that software has negatively affected in recent decades. Not a lot of people use travel agents anymore, for example.

I’m not saying that the negative effects are good. But that’s just the nature of technological advancement. It’s up to society to adapt and help out those who have been most negatively affected.

TomasBM

5 days ago

They're not really comparable.

AI is not only a general-purpose technology (such as, e.g., electricity or computers), but also the only one that can self-improve. Also, its potential for diffusion is much bigger, because unlike for electricity and computers, barriers are much more easily overcome [1].

But I'd suggest looking at this not as a separate "technology" like airplanes or smartphones, but rather as a software breakthrough. Everything that worried people about software [2] in the 20th and the early 21st century - or anything that came to pass on a smaller scale - is now much more worrisome.

[1] Pretty much any person or machine today that has access to electricity and computers can use AI, thanks to subscription models, cloud computing, VC subsidies, web access, etc.

[2] For example, large-scale automation and job loss, mass surveillance, robot swarms, etc.

dyauspitr

5 days ago

If you’re skeptical it should be because you genuinely believe it doesn’t have value. Otherwise it’s disingenuous and you’re just opposed to the idea. Dissembling just makes your argument weaker.

godelski

5 days ago

  > Because if I’m not right then I am convinced that AI is going to be a force for evil. It will power scams on an unimaginable scale.

It doesn't need to be a good coder to do that.

Look at common scams. You get those texts from "Jane" who sent you an iMessage from an email address offering you a part time job and asks you to contact them on WhatsApp, right? (Well... Android does a better job at filtering spam) Or calls from "the IRS". Or anything else that's bullshit. This even includes legal scams like charging you for canceling your service or making it hard to cancel your subscription.

There's no skill needed for this. You don't need a brilliant coder. You need the equivalent of a call center in India. You need the equivalent of a poorly worded Nigerian scam email.

Shitty coding LLMs make this shit easier to mass produce. High quality LLMs only make it worse.

Personally, I'm just tired of all the shitty lemons[0] everywhere. I wanna buy a peach, but everything being sold is a lemon. All the lemons have done is make me annoyed and frustrated at all the extra work I have to do.

I now have 4 entries for my GF's birthday because when I merged a duplicated contact it just created more instances. I can't even delete them! Shit like this sounds petty and minor but when you're hitting 20-100 new issues like this daily, it isn't so minor anymore. I can't make any more lemonaid. There are just too many lemons...

[0] https://en.wikipedia.org/wiki/The_Market_for_Lemons

mkfs

5 days ago

> It will power scams on an unimaginable scale

It will also make proving your identity harder and more time-consuming.

shwouchk

5 days ago

im sorry to say, i think this boat has sailed. it is already widely used as you fear. to me it seems like the best scenario is to go along and try to at least make it a force of good

SoupBread

5 days ago

Well buckle in boy because it's going to do those things.

dwohnitmok

5 days ago

> I’m an AI skeptic. I’m probably wrong. This article makes me feel kinda wrong. But I desperately want to be right.

To be blunt, this describes sticking one's head in the sand to a tee.

If you're convinced that AI is going to be a force for evil, then fight to make sure that it doesn't happen. If that means you want to slow down AI, then fine, fight to slow it down.

If by skepticism, you mean "desperately" wanting reality to be one way rather than the other, that is not going to significantly affect the trajectory of AI.

Being clear-eyed about where AI is going, and then fighting nonetheless for what you want is the way to go.

Be intellectually honest with yourself. Don't hold on to ideas that you yourself acknowledge are probably wrong simply for their comfort. Don't stick your head in the sand. Assess what you want, and fight for that.

Safeguards and slowing things down will not happen via wishful thinking.

gyanchawdhary

5 days ago

Beautifully said.

tonyhart7

5 days ago

right, lets put those trash code out there to poison LLM lul

addisonj

5 days ago

I absolutely sympathize with this and was/still is my opinion... but the only "evolution" of that is the hope that, while I don't think you can prevent the scams and short-term pain on labor markets... you maybe, actually, genuinely get a tool that helps change some of the dynamics that has led to the absolute discrepancy in power today.

If AI is truly as revolutionary as it could be... well, who is to say it isn't the pandoras box that destabilizes the tech giants today and gets us back to a place where a team of 10 can genuinely compete against 1000. And not in the "raise cash, build fast, and get out while things are good" trend... but actually in building, small, more principled, companies that aren't pushed to do the unsustainable things that current market pushes them to do.

Once again... it is more likely than not to be a pipe-dream... but I am starting to think it may well be better to be realistic about the momentum this freight train is building and see if it can be repurposed for my world-view rather than to cede the space to the worst of the grifters and profit-seeking-at-all-cost types.

pyrale

5 days ago

> If AI is truly as revolutionary as it could be...

My suspicion is that current sophistication of tech and AI is already enough to fulfill gp's predictions, and it's already doing that.

guappa

5 days ago

You can do no profit from AI unless you're providing the AI.

dcow

5 days ago

No, you won’t slow it down. Did you even read the essay, it’s here.

If powering scams and “destabilizing labor” makes something evil then we should delete the internet. Seriously.

krainboltgreene

6 days ago

lmao of course he wrote this.

grey-area

6 days ago

I’d love to see the authors of effusive praise of generative AI like this provide the proof of the unlimited powers of their tools in code. If GAI (or agents, or whatever comes next …) is so effective it should be quite simple to prove that by creating an AI only company and in short order producing huge amounts of serviceable code to do useful things. So far I’ve seen no sign of this, and the best use case seems to be generating text or artwork which fools humans into thinking it has coherent meaning as our minds love to fill gaps and spot patterns even where there are none. It’s also pretty good at reproducing things it has seen with variations - that can be useful.

So far in my experience watching small to medium sized companies try to use it for real work, it has been occasionally useful for exploring apis, odd bits of knowledge etc, but overall wasted more time than it has saved. I see very few signs of progress.

The time has come for llm users to put up or shut up - if it’s so great, stop telling us and show and use the code it generated on its own.

marxism

6 days ago

I think we're talking past each other. There's always been a threshold: above it, code changes are worth the effort; below it, they sit in backlog purgatory. AI tools so far seem to lower implementation costs, moving the threshold down so more backlog items become viable. The "5x productivity" crowd is excited about this expanded scope, while skeptics correctly note the highest value work hasn't fundamentally changed.

I think what's happening is two groups using "productivity" to mean completely different things: "I can implement 5x more code changes" vs "I generate 5x more business value." Both experiences are real, but they're not the same thing.

https://peoplesgrocers.com/en/writing/ai-productivity-parado...

strken

5 days ago

My friends at companies where AI tools are either mandated or heavily encouraged report that they're seeing a significant rise in low-quality PRs that need to be carefully read and rejected.

A big part of my skepticism is this offloading of responsibility: you can use an AI tool to write large quantities of shitty code and make yourself look superficially productive at the cost of the reviewer. I don't want to review 13 PRs, all of which are secretly AI but pretend to be junior dev output, none of which solve any of the most pressing business problems because they're just pointless noise from the bowels of our backlog, and have that be my day's work.

Such gatekeeping is a distraction from my actual job, which is to turn vague problem descriptions into an actionable spec by wrangling with the business and doing research, and then fix them. The wrangling sees a 0% boost from AI, the research is only sped up slightly, and yeah, maybe the "fixing problems" part of the job will be faster! That's only a fraction of the average day for me, though. If an LLM makes the code I need to review worse, or if it makes people spend time on the kind of busywork that ended up 500 items down in our backlog instead of looking for more impactful tasks, then it's a net negative.

I think what you're missing is the risk, real or imagined, of AI generating 5x more code changes that have overall negative business value. Code's a liability. Changes to it are a risk.

bicx

6 days ago

This is exactly what I’ve experienced. For the top-end high-complexity work I’m responsible for, it often takes a lot more effort and research to write a granular, comprehensive product spec for the LLM than it does to just jump in and do it myself.

On the flip side, it has allowed me to accomplish many lower-complexity backlog projects that I just wouldn’t have even attempted before. It expands productivity on the low end.

I’ve also used it many times to take on quality-of-life tasks that just would have been skipped before (like wrapping utility scripts in a helpful, documented command-line tool).

lolinder

5 days ago

This also accounts for the author of TFA's sense that the smartest people they know are skeptics. Assuming they're being used well, those people spend far more of their time in the high complexity work than they do in the low complexity stuff, so LLMs seem to be more flashy toys than serious tools to them.

bicx

5 days ago

That, or higher level in bigger orgs. I’m fairly senior, but I’m also one of two engineers at a startup. Can’t get away from the low-level work.

lolinder

5 days ago

Bigger org definitely helps, yeah. I'd have loved to have some of these tools available when I was in your shoes!

overfeed

6 days ago

> On the flip side, it has allowed me to accomplish many lower-complexity backlog projects that I just wouldn’t have even attempted before

This has been my experience at well - AI coding tools are like a very persistent junior-- that loves reading specs and documentation. The problem for AI companies is "automated burndown of your low-complexity backlog items" isn't a moneymaker, even though that's what we have. So they have to sell a dream that may be realized, or may not.

The benchmark project in the article is the perfect candidate for AI: well defined requirements with precise technical terms (RFCs), little room for undefined behavior and tons of reference implementations. This is an atypical project. I am confident AI agent write an HTTP2 server, but it will also repeatedly fail to write sensible tests for human/business processes that a junior would excel at.

tonyarkles

5 days ago

I'm currently still somewhat in the AI skeptic camp but you've intrigued me... I'm curious about taking a lesser-known RFC and trying to see what kind of implementation one of the current code-generating models actually comes up with from the spec.

cube2222

6 days ago

I think this is actually a really good point. I was just recently thinking that LLMs are (amongst other things) great for streamlining these boring energy-draining items that "I just want done" and aren't particularly interesting, but at the same time they do very little to help us juggle more complex codebases right now.

Sure, they might help you onboard into a complex codebase, but that's about it.

They help in breadth, not depth, really. And to be clear, to me that's extremely helpful, cause working on "depth" is fun and invigorating, while working on "breadth" is more often than not a slog, which I'm happy to have Claude Code write up a draft for in 15 minutes, review, do a bunch of tweaks, and be done with.

hhhAndrew

5 days ago

+1 this breadth vs depth framing. I notice this in aider itself: What right does that project have to support all those command line options, covering every little detail, and all optionally via Env variables too, and/or yaml file, and .MD docs of them all up to date? Answer: aider itself was clearly used to write all that breadth of features.

yencabulator

6 days ago

You seem to think generating 5x more code results in better code, in the left column. I highly doubt this.

grey-area

6 days ago

Yes there are huge unstated downsides to this approach if this is production code (which prototypes often become).

sbarre

6 days ago

It depends?

There's certainly a lot of code that needs to be written in companies that is simple and straightforward and where LLMs are absolutely capable of generating code as good as your average junior/intermediate developer would have written.

And of course there are higher complexity tasks where the LLM will completely face plant.

So the smart company chooses carefully where to apply the LLM and possibly does get 5x more code that is "better" in the sense that there's 5x more straightforward tickets closed/shipped, which is better than if they had less tickets closed/shipped.

yencabulator

5 days ago

That wasn't the argument. The argument is that someone using an LLM to create 5x more code will achieve things like "Adding robust error handling" and "Cleaner abstractions".

marxism

5 days ago

Just to clarify, when I say 5x code changes, I was thinking of "edit" operations.

My intuition is the tail of low value "changes/edits" will skew fairly code size neutral.

A concrete example from this week "adding robust error handling" in TypeScript.

I ask the LLM to look at these files. See how there is a big try catch, and now I have the code working, there are two pretty different failure domains inside. Can you split up the try catch (which means hoisting some variable declarations outside the block scope).

This is a cursor rule for me `@split-failure-domain.mdc` because of how often this comes up (make some RPCs then validate desired state transition)

Then I update the placeholder comment with my prediction of the failure rate.

I "changed" the code, but the diff is +9/-6.

When I'm working on the higher complexity problems I tend to be closer to the edge of my understanding. Once I get a solution, very often I can simplify the code. There are many many ways to write the same exact program. Fewer make the essential complexity obvious. And when you shift things around in exactly the kind of mechanical transformation way that LLMs can speed up... then your diff is not that big. Might be negative.

boplicity

5 days ago

I'm attempting to vibe code something for the first time. It seems to work, but the amount of cruft being generated is astounding. It's an interesting learning experience, anyways.

AnnaPali

6 days ago

I agree 100%! It's amazing how few people grok this.

sidewndr46

5 days ago

This reminds me of places that try and measure productivity by lines of code

surgical_fire

6 days ago

> The "5x productivity" crowd is excited about this expanded scope, while skeptics correctly note the highest value work hasn't fundamentally changed.

This is true, LLMs can speed up development (some asterisks are required here, but that is generally true).

That said, I've seen, mainly here on HN, so many people hyping it up way beyond this. I've got into arguments here with people claiming it codes at "junior level". Which is an absurd level of bullshit.

sothatsit

5 days ago

Exactly. Juniors may have vastly less breadth of knowledge than an LLM, but they can learn and explore and test software in ways that LLMs cannot.

However, the expansion in scope that senior developers can tackle now will take away work that would ordinarily be given to juniors.

surgical_fire

5 days ago

> And there are many things one junior could be helpful with that a different junior would be useless at.

That maybbe true, and would be an interesting topic do discuss if people actually spoke in such a way.

"Developers are now more productive in a way that many projects may need less developers to keep up productivity levels" is not that catchy to generate hype however.

steego

6 days ago

Approximately speaking, what do you want to see put up?

I ask this because it reads like you have a specific challenge in mind when it comes to generative AI and it sounds like anything short of "proof of the unlimited powers" will fall short of being deemed "useful".

Here's the deal: Reasonable people aren't claiming this stuff is a silver bullet or a panacea. They're not even suggesting it should be used without supervision. It's useful when used by people who understand its limitations and leverage its strengths.

If you want to see how it's been used by someone who was happy with the results, and is willing to share their results, you can scroll down a few stories on the front-page and check the commit history of this project:

https://github.com/cloudflare/workers-oauth-provider/commits...

Now here's the deal: These people aren't trying to prove anything to you. They're just sharing the results of an experiment where a very talented developer used these tools to build something useful.

So let me ask you this: Can we at least agree that these tools can be of some use to talented developers?

grey-area

6 days ago

Yes sure I’ve checked in code generated by AI myself. I’ve not experienced the excitement this article exudes though and it seems very limited in usefulness due to the by now well-documented downsides. Frankly I haven’t bothered using it much recently, it’s just not there yet IME and I’m not sure LLMs ever will be.

What I’m interested in really is just case studies with prompts and code - that’s a lot more interesting for hackers IMO than hype.

hooverd

6 days ago

It's useful, but the promise of every AI company is very explicitly that they will burn the seed corn and choke off the pipeline that created those "very talented" developers who reviewed it!

grey-area

6 days ago

I’m less worried about this as the best way to learn to code is to read as well as write it IMO.

If capabilities don’t improve it’s not replacing anyone, if they do improve and it can write good code, people can learn from reading that.

I don’t see a pathway to improvement though given how these models work.

mrbungie

6 days ago

> Here's the deal: Reasonable people aren't claiming this stuff is a silver bullet or a panacea

This article and vocal supporters are not being reasonable at all, they make a not so between-the-lines separation between skeptics (which are nuts) and supporters ("My smartest friends are blowing it off." in a smug "I'm smarter than my smarter friends").

I mean, come on.

steego

5 days ago

You are absolutely correct. This article and vocal supporters are often not reasonable and I should have made that point.

I honestly found the article to be an insufferably glib and swaggering piece that was written to maximize engagement rather than to engage the subject seriously.

The author clearly values maximizing perceived value with the least amount of effort.

Frankly, I’m tired of reading articles by people who can’t be bothered to present the arguments of the people they’re disagreeing with honestly and I just gave up halfway reading it because it was so grating.

mbesto

6 days ago

> Reasonable people aren't claiming this stuff is a silver bullet or a panacea.

Are you saying the CEO of Anthropic isn't reasonable? or Klarna?

sbarre

5 days ago

The CEO of Anthropic is the least reasonable person in this discussion.

Surely you can see how insanely biased all of their statements would be. They are literally selling the shovels in this gold rush.

Anything they say will be in service of promoting AI, even the bad/cautionary stuff because they know there's an audience who will take it the other way (or will choose to jump in to not be left behind), and also news is news, it keeps people talking about AI.

tptacek

5 days ago

For the record, as the author of this piece, I do not think anybody should factor what the CEO of Anthropic thinks into their decisions. There is in fact a section on this argument in the post. It's short, so it's easy to miss.

NeutralCrane

5 days ago

Of course not. CEOs are Chief Narrative Officers, one of their main functions is to craft and push a message (which is different than collating and reporting facts). Reason doesn’t not factor in.

jcon321

6 days ago

I think that experiment was very cool, but I will say that the OAuth2.0/OIDC protocol is very well documented and there are tons of tools already built around it in multiple languages.

I implemented the OAuth2.0 protocol in 3 different languages without a 3rd party library - entire spec implemented by hand. This was like ~2015 when many of the libraries that exist today didn't back then. I did this as a junior developer for multiple enterprise applications. At the end of the day it's not really that impressive.

MrDarcy

6 days ago

Three weeks ago I did basically the same thing as the author of the Cloudflare story, but I did it with my own open source tool. I went into the experiment treating Claude Code as a junior engineer and guiding it on a feature I wanted implemented.

In a single Saturday the LLM delivered the feature to my spec, passing my initial test cases, adding more tests, etc…

I went to bed that night feeling viscerally in my bones I was pairing with and guiding a senior engineer not a junior. The feature was delivered in one day and would have taken me a week to do myself.

I think stories like the Cloudflare story are happening all over right now. Staff level engineers are testing hypotheses and being surprised at the results.

Oauth 2.0 doesn’t really matter. If you can guide the model and clearly express requirements, boundaries, and context, then it’s likely to be very useful and valuable in its current form.

dcre

5 days ago

This is a great example of how no example provided is ever good enough. There’s always an argument that it doesn’t really count. Yet you just said the computer is doing what you did as a junior developer.

kmacdough

5 days ago

It's not supposed to be impressive. It's a faster way to do the unimpressive stuff. Which is the bulk of real-world software work at most companies.

Maybe you just have that dream job where you only have to think hard thoughts. But that's just not the norm, even at a bleeding edge startup.

ilaksh

5 days ago

Exactly how long did it take you? And now how much actual time was spent in the comparison prompting and code review by Cloudflare?

citizenpaul

6 days ago

Yeah exactly.

Whats nuts is watching all these people shill for something that we all have used to mediocre results. Obviously Fly.io benefits if people start hosting tons of slopped together AI projects on their platform.

Its kinda sad to watch what I thought was a good company shill for AI. Even if they are not directly getting money from some PR contract.

We must not be prompting hard enough....

simonw

6 days ago

Saying "this tool is genuinely useful to me and it's baffling how many people refuse to acknowledge that could possible be true" is not a sign that someone is being paid to "shill for AI".

(If it is then damn, I've been leaving a ton of money on the table.)

kalkin

6 days ago

Some days it seems like the most vulgar Marxist would be more rigorous about doing a who-benefits analysis before using that to dismiss an argument than the median HN commenter. Fly.io benefits from AI hype because... more slop code gets written and then run on their servers? tptacek would burn his HN cred for that? Really?

jsheard

6 days ago

> Fly.io benefits from AI hype because... more slop code gets written and then run on their servers?

I don't know if that's what fly.io is going for here, but their competitors are explicitly leaning into that angle so it's not that implausible. Vercel is even vertically integrating the slop-to-prod pipeline with v0.

kalkin

6 days ago

As far as I can see this only makes sense if they actually believe that AI will accelerate production code that people will want to pay to keep running. Maybe they're wrong about that, but it's got to be sincere, or the strategy would be incoherent.

citizenpaul

5 days ago

There are a lot of people whom's brain warning signals shut off when they think they are onto some money making idea. They will spend vastly illogicall amounts to "keep it going" often for years before admiting it's not working.

Look at stock market courses for instance. They are endlessly prevalent, an eternally green scam. People spend thousands to lose even more money all the time. Sunk cost fallacy is very hard for a lot of people to overcome. Scammers count on it. There is literally millions to be made in these scams if you have zero moral fiber and zero shame.

We are in a golden age of such scams. Not my quote but one article I read said something like business students right now are putting Adam Neumann's picture on their dorm walls to aspire to be like him...

blibble

6 days ago

> Whats nuts is watching all these people shill for something that we all have used to mediocre results.

this sort of post is the start of next phase in the battle for mindshare

the tools are at the very best mediocre replacements for google, and the people with a vested interest in promoting them know this, so they switch to attacking critics of the approach

> Its kinda sad to watch what I thought was a good company shill for AI.

yeah, I was sad too, then I scrolled up and saw the author. double sadness.

CuriouslyC

6 days ago

If you really think that feel free to continue with business as usual. I just hope you're not at a stack ranking company, or you are politically savvy though, because otherwise you're going to be in for a real shock in the next few years as your peers build their AI skills, tooling matures and models improve. A skilled dev with a well tuned agentic workflow can already finish non-trivial 5k LoC projects in a day, complete with copious tests and documentation, just imagine when the ecosystem has matured and the majority of your coworkers are hip to the game.

citizenpaul

5 days ago

>hip to the game.

Thats the crux.

copious tests - That don't work but no one cares.

documentation - That no one has or ever will read, and is hilariously inaccurate.

There is a lot of software pre AI that is churned out because some manager wanted exactly what they wanted but it had no purpose or need. I expect that to explode in the coming years for sure. I'm not afraid of AI, its merely ok, another tool is all.

It will allow companies to dig themselves very deep holes. Devs wise to the game will be able to charge astronomical fees to empty the pools filled with the AI sewage they have been filled with.

jagged-chisel

6 days ago

> … if it’s so great, stop telling us and show …

If you’re selling shovels to gold miners, you don’t need to demonstrate the shovel - you just need decent marketing to convince people there’s gold in them thar hills.

XorNot

6 days ago

This is actually a great metaphor and phrasing and I'm filing it away for later btw.

pera

5 days ago

It's a bit more than a metaphor :) during the California gold rush there was this guy named Sam Brannan who sold shovels and other tools to miners, and made a fortune from it (he is often referred to as California's first millionaire). He also had a newspaper at the time, the California Star, which as you can imagine was used to promote the gold rush:

> The excitement and enthusiasm of Gold Washing still continues—increases. (1848)

https://sfmuseum.org/hist6/star.html

https://en.wikipedia.org/wiki/Samuel_Brannan

tsimionescu

6 days ago

Note that it's a pretty common cliche, usually phrased something like "in a gold rush, the only people guaranteed to make money are the guys selling the shovels".

kwertyoowiyop

6 days ago

It’s actually an offbeat take on that common cliche.

rybosome

6 days ago

Many, many people are in fact “using the code it generated on its own”. I’ve been putting LLM-assisted PRs into production for months.

With no disrespect meant, if you’re unable to find utility in these tools, then you aren’t using them correctly.

photonthug

6 days ago

> unable to find utility.. aren’t using them correctly

Tfa makes this argument too then later says:

> All this is to say: I write some Rust. I like it fine. If LLMs and Rust aren’t working for you, I feel you. But if that’s your whole thing, we’re not having the same argument

So reasonable people admit that the utility depends on the use case.. then at the same time say you must be an idiot if you aren’t using the tools. But.. this isn’t actually a reasonable position.

Part of the issue here may be that so many programmers have no idea what programmers do outside of their niche, and how diverse programming actually is.

The typical rebuttals of how “not everyone is doing cliche CRUD web dev” is just the beginning. Author mentions kernel dev, but then probably extrapolated to C dev in general. But that would be insane, just think about the training sets for Linux kernel dev vs everything else..

It’s dumb to have everyone double down on polarizing simplistic pro/con camps, and it’s rare to see people even asking “what kind of work are you trying to do” before the same old pro/con arguments start flying again.

rybosome

5 days ago

Yeah, very much agreed that in the spirit of good discussion I should’ve at least asked about their experiences and use case before jumping to finger wagging.

But that said, let me reiterate a couple important points from my post:

> With no disrespect meant

I’m not calling anybody an idiot because they aren’t using an LLM. I’m sharing my honest opinion that they’re not using it correctly, but that’s very different than calling them an idiot.

> if you’re unable to find utility in these tools

This is a bit lawyerly, but note my carefully generic wording here: “find utility”. If you’re a Rust developer who doesn’t like the Rust output from your LLM, sure - but that’s not 100% of the job.

You’ll also touch bash scripts, make files, YAML, JSON or TOML config, write bug reports/feature requests, discuss architectural ideas and coding patterns, look through stack traces/dumps/error logs, or whatever else.

My point is that it is exceedingly unlikely that there is nothing an LLM can do to help your work, even if it’s not good at writing code in your domain.

Hence the statement that if you cannot find utility, you’re not using it correctly. It takes time to learn how to use these tools effectively, even in domains they excel in.

superconduct123

5 days ago

Its such an insane argument

Its like I can't just switch our whole 1-million line codebase on a dime

These articles act like everyone is just cranking out shitty new webapps, as if every software job is the same as the author's

douglasisshiny

6 days ago

Okay, how am I supposed to use them "correctly"? Because me explaining step by step, more so than a junior developer, how to do a small task in an existing codebase for it to get it wrong not once, not twice, not three times, but more is not a productivity boost.

And here's the difference between someone like me and an LLM: I can learn and retain information. If you don't understand this, you don't have a correct understanding of LLMs.

simonw

6 days ago

It is entirely true that current LLMs do not learn from their mistakes, and that is a difference between eg an LLM and a human intern.

It is us, the users of the LLMs, that need to learn from those mistakes.

If you prompt an LLM and it makes a mistake, you have to learn not to prompt it in the same way in the future.

It takes a lot of time and experimentation to find the prompting patterns that work.

My current favorite tactic is to dump sizable amounts of example code into the models every time I use them. I find this works extremely well. I will take code that I wrote previously that accomplishes a similar task, drop that in and describe what I want it to build next.

recursive

5 days ago

You seem to be assuming that the thing I'm learning is not "Stop using LLMs for this kind of work".

lando2319

6 days ago

yep I've used Devon and now Google Jules, for the big stuff, it has lots of wrong code, but it still end up giving my a much better start than starting from scratch certainly. When it all comes together it give me a 6X boost. But def fixing all the wrong code and thoroughly testing it is the time consuming part.

surgical_fire

6 days ago

> LLM-assisted PRs

This does not counter what GP said. Using LLM as a code assistant is not the same as "I don't need to hire developers because LLMs code in their place"

detaro

6 days ago

Which one is the article talking about?

liveoneggs

6 days ago

The first paragraph of this blog calls senior skeptical developers "unserious" but the person who wrote it has a cartoon gopher as his avatar.

The next few paragraphs basically say "the tool run arbitrary programs on your machine, pull in arbitrary files, and use that to run more arbitrary commands" and then blames you for thinking that is a bad sequence of events.

In the best possible light I (an AI-neutral reader) can paint this rant on a hosting-company blog (why publish this?) is that 1) allowing random textbots to execute programs on your work computer is good (disagree), 2) those chatbots do, in fact, occasionally say enough correct-ish things that they are probably worth your company paying $20+/month for your access (agree).

sethherr

6 days ago

I think tptacek is generally worth reading. He is one of the users with the highest karma on this site (https://news.ycombinator.com/user?id=tptacek)

I’m happy to have read this, which is reason enough to publish it - but also it’s clearly generating debate so it seems like a very good thing to have published.

lolinder

5 days ago

I generally like tptacek, but this piece has major flaws. His dismissal of all copyright concerns on the grounds that some developers engage in piracy is horribly lazy.

chinchilla2020

5 days ago

High karma in an internet community is not something I respect automatically. Pewdiepie and every other little online personality have tons of followers and likes.

tptacek has always come across arrogant, juvenile, opinionated, and difficult to work with.

liveoneggs

5 days ago

co-mingling your work identity with your gopher-identity is a dangerous game; I should clarify the gopher is not UMN-related.

yunwal

5 days ago

1 is not a requirement for using agents. You give agents access to a specific workspace and set of tools that you know are safe, similar to how you give the junior dev access to scratch but not prod. Feels like this comment is not in good faith if I’m being honest.

liveoneggs

5 days ago

According to the article you are not a serious user of LLMs and are as out of date as the curmudgeon straw men being attacked.

There are zero "safe" tools where you don't control the inputs.

SatvikBeri

6 days ago

I don't think I would notice a 100% improvement in software productivity in most companies, from the outside. Most of the time, that would just translate to the company being able to hire fewer developers, and having slightly higher profit margins – but not enormously higher, because developers are only one part.

I recently used Claude Code to develop & merge an optimization that will save about $4,000 a month. It was relatively simple but tedious, so I probably wouldn't have done it on my own. I don't even expect most of my coworkers to notice.

lelandbatey

6 days ago

If you read post, the article is mostly agreeing with you. What they're pointing out is not "the AI can do everything you do", it's that "an AI coder can do a lot of the boring typing a lot faster than you, leaving you right at the point of 'real implementation'".

Having something else write a lot of the boring code that you'll need and then you finish up the final touches, that's amazing and a huge accelerator (so they claim).

The claim is not "AI will replace us all", the claim of the parent article is "AI is a big deal and will change how we work, the same way IDEs/copy-paste/autocomplete/online documentation have radically changed our work."

Seb-C

5 days ago

> an AI coder can do a lot of the boring typing a lot faster than you, leaving you right at the point of 'real implementation

Vim and bash solved that for me a long time ago in a more reliable and efficient way (and it's certainly not the only tool capable of that).

> the same way IDEs/copy-paste/autocomplete/online documentation have radically changed our work

I was there before and went in the autocomplete/lsp thing pretty late (because Vim didn't have good lsp support for a long time, and Vim without it was still making me more efficient than any other IDE with it). Those things didn't radically change our work as you claim, it just made us a bit more productive.

johnfn

6 days ago

Who said "unlimited powers", other than you? That seems like an intentional distortion of the article, which says "LLMs can write a large fraction of all the tedious code you’ll ever need to write". Doing most of my tedious work is very different than "unlimited powers".

I use AI to chew through tedious work all the time. In fact, I let an agent do some work just before I checked HN to read your claim that it can't do that. Everyone at my job does the same, perhaps modulo checking HN. But there's no 'unlimited power' to show you - we're just about 30% faster than we used to be.

Retric

5 days ago

You might be 30% faster, but not everyone is doing your job specific with your exact skill set. I suspect the range is like -5% to 50%, the question nobody can answer is what the average is because things are so diverse and nobody is collecting good data.

In decades of programming I’ve written very little tedious code, but that’s as much about the projects I’ve worked on as approach I use.

pj_mukh

6 days ago

I think this is a misunderstanding coder productivity. A 10x engineer isn't 10x faster at popping out Unit tests, that stuff is mind-numbingly boring that turns out a next token predictor can do it with ease. In fact I would guess that really "productive" software engineers, slow down considerably when forced to do this important but slow work*.

The 10x engineer is really good at deducing the next most important thing to do is and doing it quickly. This involves quickly moving past 100's of design decisions in a week to deliver something quickly. It requires you to think partly like a product manager and partly like a senior engineer but that's the game and LLM's are zero help there.

Most engineering productivity is probably locked up in this. So yes, LLM's probably help a lot, just not in the way that would show on some Jira board?

*One could claim that doing this slow work gives the brain a break to then be good at strategizing the higher order more important work. Not sure.

protocolture

6 days ago

>I’d love to see the authors of effusive praise of generative AI like this

He spent a large tranche of the article specifically hanging a lantern on how mediocre the output is.

>by creating an AI only company

He specifically says that you need to review the code over and over and over.

Karrot_Kream

6 days ago

This 2 year old Goroutine pool implementation [1] is 95% GPT generated and has commit history showing what GPT did. It's an older example, but it is one.

[1]: https://github.com/devchat-ai/gopool

mvdtnz

6 days ago

> If GAI (or agents, or whatever comes next …) is so effective it should be quite simple to prove that by creating an AI only company and in short order producing huge amounts of serviceable code to do useful things.

I don't think this follows. Anyone can see that 10-ton excavator is hundreds or even thousands of times more efficient than a man with a shovel. That doesn't mean you can start a company up staffed only with excavators. Firstly you obviously need people operating the excavator. Secondly the excavator is incredibly efficient at moving lots of dirt around, but no crew could perform any non-trivial job without all the tasks that the excavator is not good out - planning, loading/unloading, prepping the site, fine work (shovelling dirt around pipes and wires), etc.

AI is a tool. It will mean companies can run much leaner. This doesn't imply they can do everything a company needs to do.

bberenberg

6 days ago

I build tools all the time using AI tools that would have taken me 10x longer before. A couple examples below. The first example was a literal one shot of my dumping the full page html into cursor and asking it for the feature I needed. In using it today it already saved me over an hour of analysis time across the refunds I analyze. The second example includes using AI for better classifying text and having our support people better use their time to help customers. I could have probably made the second thing without LLMs but I didn’t have the skills. Here I was able to provide concrete value to our company and our customers.

https://github.com/Atlas-Authority/mpac-ui-improved https://moduscreate.com/blog/forum-monitoring-is-essential-b... (Pardon how marketing keyword stuffed the final post)

keybored

6 days ago

> The time has come for llm users to put up or shut up - if it’s so great, stop telling us and show and use the code it generated on its own.

I’m open to that happening. I mean them showing me. I’m less open to the Nth “aww shucks, the very few doubters that are left at this point are about to get a rude awakening” FOMO concern trolling. I mean I guess it’s nice for me that you are so concerned about my well-being, soon to be suffering-being?

Now, AI can do a lot of things. Don’t get me wrong. It has probably written a million variations on the above sentiment.

avanai

6 days ago

A "eulogy" is a speech you make at a funeral in honor of the dead person. I think you meant "apology".

tsimionescu

6 days ago

While that is the most common sense of eulogy, it's not the only one. A eulogy is also any speech that highly praises someone or something - which is most commonly done at funerals, which is how the funeral association came about (also probably by association with an elegy, which is an etymologically unrelated word that refers to a Greek poem dedicated to someone who passed away).

In many romance languages, eulogy doesn't have the funeral connotation, only the high praise one - so the GP may be a native speaker of a romance language who didn't realize this meaning is less common in English.

grey-area

6 days ago

Yes I think I was thinking more a paean or apology though not sure apology is used in that sense much nowadays - perhaps apologia is clearer. In praise of would be better, thanks will edit just now.

antithesizer

6 days ago

The Greek transliteration "apologia" is often used for that sense of "apology" to skirt any ambiguity.

selimthegrim

6 days ago

Encomium

ofjcihen

6 days ago

Honestly it’s really unfortunate that LLMs seem to have picked up the same hype men that attached themselves to blockchains etc.

LLMs are very useful. I use them as a better way to search the web, generate some code that I know I can debug but don’t want to write and as a way to conversationally interact with data.

The problem is the hype machine has set expectations so high and refused criticism to the point where LLMs can’t possibly measure up. This creates the divide we see here.

busymom0

6 days ago

I think LLM hype is more deserved and different from that of blockchain.

There's still a significant barrier to entry to get involved with blockchain and most people don't even know what it is.

LLMs on the other hand have very low barrier to at least use- one can just go to google, ChatGPT etc and use it and see its effectiveness. There's a reason why in the last year, a significant portion of school students are now using LLMs to cheat. Blockchains still don't have that kind of utilization.

ofjcihen

6 days ago

I agree with all of these points.

Honestly I think that makes the argument stronger though that it’s unfortunate they jumped on.

vohk

6 days ago

I think I agree with the general thrust but I have to say I've yet to be impressed with LLMs for web search. I think part of that comes from most people using Google as the benchmark, which has been hot garbage for years now. It's not hard to be better than having to dig 3 sponsored results deep to get started parsing the list of SEO spam, let alone the thing you were actually searching for.

But compared to using Kagi, I've found found LLMs end up wasting more of my time by returning a superficial survey with frequent oversights and mistakes. At the final tally I've still found it faster to just do it myself.

I will say I do love LLMs for getting a better idea of what to search for, and for picking details out of larger blocks.

magicalhippo

5 days ago

I find search engines like Google and Bing are so overly keen on displaying any results that they'll ignore your search parameters and return something else instead.

Thus, I find LLMs quite useful when trying to find info on niches that are close to a very popular topic, but different in some key way that's hard to express in search terms that won't get ignored.

jcranmer

6 days ago

> I think part of that comes from most people using Google as the benchmark, which has been hot garbage for years now.

Honestly, I think part of the decline of Google Search is because it's trying to increase the amount of AI in search.

taormina

6 days ago

It’s part of it. It was on the decline and then for some reason Google decided that the best way to handle their dumpster fire was to chuck gasoline cans into it.

antithesizer

6 days ago

There's not much riding on convincing the broader public that AI is the real deal before it's proved itself beyond the shadow of any doubt. There's nothing they can do to prepare at this point.

idan

6 days ago

This has all the energy of people saying "ah, you take such great photos, you must have a good camera"

_People_ are getting outsized value from AI in the ways they apply it. Photographs come from the photographer, not the camera.

the__alchemist

6 days ago

I suspect it comes down to use case. Those you refer to use it for well-trodden areas; likely web UIs and servers using popular tools. The image they predict is extrapolated from these.

NotAnOtter

6 days ago

The author's central argument seems to be that the current state of LLM development is such that 1 Senior + LLM === 1 Senior + 4 juniors

With that as a metric, 1 Senior + 4 juniors cannot build the company with the scope you are describing.

A 50-eng company might have 1 CTO, 5 staff, 15 Seniors, and 29 juniors. So the proposition is you could cut the company in ~half but would still require the most-expensive aspects of running a company.

quantadev

6 days ago

Even if you have one trillion juniors, you're not going to get them to spit out 1000 lines of code in a matter of seconds, after you give them an assignment, like LLMs do, so this is a faulty comparison.

Also often it takes a senior dev _more_ time to _explain_ to a junior what needs to be done than it takes to do it himself. What LLMs give us is the ability to generate a feature about as fast as we can type up the instructions we would have, pre-AI, given to a junior dev.

surgical_fire

6 days ago

> The author's central argument seems to be that the current state of LLM development is such that 1 Senior + LLM === 1 Senior + 4 juniors

This is such an outlandish claim, to the point where I call it plain bullshit.

LLMs are useful in a completely different way that a Junior developer is. It is an apples and oranges comparison.

LLMs does things in some way that it helps me beyong what a Junior would. It also is completely useless to perform many tasks that a Junior developer can.

NotAnOtter

6 days ago

I disagree that they are so different you can't compare them.

Imagine a senior IC staffed with 4 juniors, and they spend 2 hours with each every day. Then the junior is left with 6 hours to think through what they were taught/told. This is very similar to LLM development except instead of context switching 3 times each day, the senior can skip over the 6 hours of independent time the junior required to absorb the changes. But it still takes the same amount of time to deliver the 4 separate projects.

I find the existence of LLM development deeply troubling for a long list of reasons. But refuting the claim that an LLM is similar in many ways to a junior dev is unsubstantiated

>It also is completely useless to perform many tasks that a Junior developer can.

And there are many things one junior could be helpful with that a different junior would be useless at.

surgical_fire

5 days ago

> Imagine a senior IC staffed with 4 juniors, and they spend 2 hours with each every day. Then the junior is left with 6 hours to think through what they were taught/told. This is very similar to LLM development except instead of context switching 3 times each day, the senior can skip over the 6 hours of independent time the junior required to absorb the changes.

This is not how Juniors work. I don't know what else to say. It is just not true.

I don't give juniors a prompt and let them to implement code for a few hours. They work as any other developer, just generally in features and/or tickets of more limited scope. At least initially. This is not what LLMs do

> But refuting the claim that an LLM is similar in many ways to a junior dev is unsubstantiated

I sometimes get the feeling I talk to people who never worked in a real professional setting.

A LLM can do things that Juniors can't. When I bounce around ideas for implementing a certain feature, when I explore libraries or frameworks I am unfamiliar with, when I ask it to review pieces of code looking for improvements, when I get it to generate boring glue code, scaffolding, unit tests. All those things are helpful, and make LLMs an excellent code assistant in a way that Juniors are not.

But it is completely unable to properly do things without me giving very precise instructions of what it needs to code. The less precise I am, the worse its output. It is very happy to generate completely bullshit code that kinda looks like it does what I need but not really. I constantly need to tweak what it generates, and although it saves my time as it outputs a lot of code in little time, the results are very unreliable to meaningfully act with any sort of independence.

> And there are many things one junior could be helpful with that a different junior would be useless at

Which completely fails to address the point I am making.

NotAnOtter

5 days ago

I'm not saying that is a typical workflow. I'm saying A ~= B, and B ~= C.

A - normal, conventional senior dev work flow

B - A non-traditional but plausible senior dev workflow

C - Senior dev with LLM

I'm not claiming A = B = C, just that the step from one to another is relatively small when compared to something like a linter or other tool that accelerates development in some way.

> Which completely fails to address the point I am making.

If an LLM can do 20% more things than a junior dev can but also cannot do a different 20% of things a junior dev can, then the LLM is similar the a junior. And when comparing two juniors within a given field, it's entirely likely the above logic could apply. E.g. one junior may not be able to write SQL queries while the other does not understand the DOM. An LLM on the other hand is "kinda ok" at everything, but cannot engage in 'real' conversations about system architecture or reason through user experiences reliably. So it can do 20% more, and also 20% less. Just as a junior dev.

No one is claiming A = C, so don't keep punching the wind. They are claiming A ~= C.

surgical_fire

5 days ago

> If an LLM can do 20% more things than a junior dev can but also cannot do a different 20% of things a junior dev can

This is not how it works. I am only punching wind because your arguments are as solid as air.

Comparing an LLM to a Junior Dev is like comparing a Secretary to an Answering Machine - both can technically answer calls, so they must be somewhat similar? What a load of hot, steaming bullshit.

> No one is claiming A = C, so don't keep punching the wind. They are claiming A ~= C.

And I am claiming that A != C. I was not even arguing against they are equal, I am arguing against them being similar in any way.

I maintain what I said before, I sincerely doubt the competence and experience in real life professional setting of anyone claiming LLMs are in anyway remotely similar to junior devs

NotAnOtter

5 days ago

> Comparing an LLM to a Junior Dev is like comparing a Secretary to an Answering Machine

The number of secretaries declines after answering machines became more prevalent.

You can keep throwing ad hominin if you think it's helping change reality. I wish we weren't headed this way, I really do, but we are. So we might as well confront reality. Catch you in 3 years when whatever happens - happens.

surgical_fire

5 days ago

> The number of secretaries declines after answering machines became more prevalent.

Citation needed

> Catch you in 3 years when whatever happens - happens.

I wish I had 1 euro for whenever crypto shills told me those exact words regarding the inevitably of cryptocurrency replacing actual currency.

Note that I said 1 euro and not 1 of some shitcoint.

paulcole

5 days ago

> The time has come for llm users to put up or shut up - if it’s so great, stop telling us and show and use the code it generated on its own.

Why would we do this? Wouldn’t it be better to do this silently and reap the benefits?

rukuu001

5 days ago

Re put up or shut up - this is a decent example from the front page today: https://news.ycombinator.com/item?id=44159166

georgemcbay

6 days ago

I don't know if you are the same (S.G.) greyarea I'm familiar with but I hope so because the idea of having a couple of 90s era irc people take opposing viewpoints on LLMs in 2025 amuses me.

crazygringo

6 days ago

That's a strawman.

Nobody is saying it's "unlimited powers", that's your exaggeration.

And what you're proposing about an "AI only company" seems to be based on your misunderstanding.

What this article is saying is, you need the same number of senior developers, but now each one is essentially assisted by a few junior developers virtually for free.

That's huge. But saying you want to see an "AI only company" as "proof" has nothing to do with that.

And what you're describing -- "occasionally useful for exploring apis, odd bits of knowledge etc, but overall wasted more time than it has saved" -- is exactly what the author explicitly addresses at the top:

> If you were trying and failing to use an LLM for code 6 months ago, you’re not doing what most serious LLM-assisted coders are doing. People coding with LLMs today use agents...

The entire article is about how to use LLM's effectively. What kind of "proof" do you really want, when the article explains it all awfully clearly?

magarnicle

6 days ago

From what I've seen, it takes a deeply original thinker to get something great out of generative AI. But if you're a deeply original thinker, you're probably exceptional regardless of whether you are using AI or not.

Generative AI is too much of a blank canvas at the moment, and one that is always shifting. It's up to the user to find all the use cases, and even then in my experience it's just as likely to send me on a wild goose chase as it is to instantly solve my problem.

sanderjd

6 days ago

What kind of proof are you looking for here, exactly? Lots of businesses are successfully using AI... There are many anecdotes of this, which you can read here, or even in the article you commented on.

What else are you looking for?

frank_nitti

6 days ago

What do you mean by “successfully using AI”, do you just mean some employee used it and found it helpful at some stage of their dev process, e.g. in lieu of search engines or existing codegen tooling?

Are there any examples of businesses deploying production-ready, nontrivial code changes without a human spending a comparable (or much greater) amount of time as they’d have needed to with the existing SOTA dev tooling outside of LLMs?

That’s my interpretation of the question at hand. In my experience, LLMs have been very useful for developers who don’t know where to start on a particular task, or need to generate some trivial boilerplate code. But on nearly every occasion of the former, the code/scripts need to be heavily audited and revised by an experienced engineer before it’s ready to deploy for real.

sanderjd

6 days ago

Yeah, I should have posted the first version of my post, pointing out that the problem with this demand for proof (as is often the case) devolves into boring definitional questions.

I don't understand why you think "the code needs to be audited and revised" is a failure.

Nothing in the OP relies on it being possible for LLMs to build and deploy software unsupervised. It really seems like a non sequitur to me, to ask for proof of this.

frank_nitti

5 days ago

That’s fair regarding the OP, and if otherwise agree with your sentiments here.

Some other threads of conversation get intertwined here with concerns about delusional management making decisions to cut staff and reduce hiring for junior positions, on the strength of the promises by AI vendors and their paid/voluntary shills

For many like me who have encouraged sharp young people learn computers, we are watching their spirits crushed by this narrative and have a strong urge to push back — we still need new humans to learn how computer systems actually work, and if nobody is willing to pay them for work because an LLM outperforms them on those menial “rite-of-passage” types of software construction, we will find ourselves in a bad place

sanderjd

4 days ago

Yep, I definitely relate to your final paragraph!

Hammershaft

6 days ago

I'd like to see any actual case studies. So far I have only heard vague hype.

droidist2

5 days ago

Software engineering is always light on case studies though, for instance test driven development, or static vs. dynamic typing, people have been debating these for quite a long time.

soraminazuki

4 days ago

At least actual examples are underwhelming.

https://news.ycombinator.com/item?id=44050152

ewild

6 days ago

i mean i can state that i built a company wihtin the last year where id say 95% of my code involved using an LLM. I am an experienced dev so yes it makes mistakes and it requires my expertise to be sure the code works and to fix subtle bugs; however, i built this company me and 2 others in about 7 months for what wouldve easily taken me 3 years without the aid of LLMs. Is that an indictment of my ability? maybe, but we are doing quite well for ourselves at 3M arr already on only 200k expense.

grey-area

6 days ago

That’s genuinely far more interesting and exciting to me (and I’m sure others too) than this sort of breathless provocation, esp if code and prompts etc are shared. Have you written about it?

ewild

5 days ago

i have not written about it due to me being a silent partner in the company and dont want my name publicly attatched to it, but the code and prompts is all more like talking to a buddy is how i use it, i ask it to build specific things then i look through and make changes. For instance a few examples i can give is there is a lot of graph traversal in my data i built, I'm not an expert on graph traversal, so I researched what would be a good algo for my type of data, and then utilized claude to implement the papers algorithm into my code and data structures. I dont have the llm in any steps that the customer interact with (there is some fuzzy stuff but nothing consistently run) but i would say an llm has touched over 90% of the code i wrote. its just an upgraded rubber ducky to me.

If i wasn't experienced in computer science this would all fall apart however i do have to fix almost all the code, but spending 10 mins fixing something is better than 3 days figuring it out in the first place (again this might be more unique to my coding and learning style)

sanderjd

6 days ago

Do you find that these kinds of anecdotes are hard to find?

sanderjd

3 days ago

There are lots of concrete examples in this article: https://www.semafor.com/article/06/03/2025/how-vibe-coding-i...

Do I think those rise to "case studies"? No. But to another commenters point, detailed and rigorous case studies have always been hard to come by for any productivity process or technology.

I also think that article is hype, but it's not true that it's vague.

ghostly_s

6 days ago

Did you even glance at the link? The author is advocating for a human-supervised LLM agent workflow.

cube2222

6 days ago

Maybe they're just busy shipping code instead of proving stuff to you, which they stand nothing to gain from?

You set up a strawman (AI only companies, agents doing everything on their own) which is irrelevant to the point the article is making. One excerpt:

> Almost nothing it spits out for me merges without edits. I’m sure there’s a skill to getting a SOTA model to one-shot a feature-plus-merge! But I don’t care. I like moving the code around and chuckling to myself while I delete all the stupid comments. I have to read the code line-by-line anyways.

I think this article is very on point, I relate with basically every paragraph. It's not a panacea, it's not a 10x improvement by any means, but it's a very meaningful improvement to both productivity (less than 2x I'd say, which would already be a ton) and fun for me. As I've mentioned in the past here

> I feel like there’s also a meaningful split of software engineers into those who primarily enjoy the process of crafting code itself, and those that primarily enjoy building stuff, treating the code more as a means to an end (even if they enjoy the process of writing code!). The former will likely not have fun with AI, and will likely be increasingly less happy with how all of this evolves over time. The latter I expect are and will mostly be elated.

which is a point the article makes too (tables), in a slightly different way.

Also, to be clear, I agree that 90% of the marketing around AI is overblown BS. But that's again beside the point, and the article is making no outlandish claims of that kind.

Overall, I hope this article (as intended) will make more people lose their dismissiveness and wake up their curiosity, as I expect the future of those is akin to that of people today saying they're "not really good at computers". It's a paradigm-shift, and it takes getting used to and productive in, as some imo smart people are mentioning even in this thread[0].

[0]: https://news.ycombinator.com/item?id=44164039

mcphage

6 days ago

> Maybe they're just busy shipping code instead of proving stuff to you, which they stand nothing to gain from?

Well, in this case they’re busy writing articles trying to convince us, instead of proving stuff to us.

cube2222

5 days ago

Point taken. I’d still argue though that writing an article like this is at least an order of magnitude easier than proving a, say, 30% productivity boost to outside observers.

heathrow83829

6 days ago

the greatest proof that it's a useful tool is how much people are using it. Almost everyone is using it. If you're using it then you probably find it useful as well.

runarberg

5 days ago

a) If “everyone is using it”, show me the numbers. Doing a developer survey to see how many are using it is not hard. Ask a random sample of developers if they ⅰ use LLMs to aid their standard coding or ⅱ completely replace their coding with LLM output.

b) Even if “everyone is using it” it doesn’t mean it is useful. The usage could be adequately explained by e.g. marketing, being forced on them by management/policy, etc. Not everything with high usage is useful. I can e.g. quickly think of chewing gum (which is also used by a lot of developers), or the ANSI standard keyboard (as opposed to the ISO standard keyboard).

latentsea

6 days ago

And the greatest proof that people are using software is that people are complaining about it.

conradev

6 days ago

Have you used a language model to program yet?

grey-area

6 days ago

Yes sure, I said so in the post, and have watched others try to do so too.

geoduck14

6 days ago

You think that the only code that is valuable is code that is written by a professional SWE.

There are LOADS of people who need "a program" but aren't equipped to write code or hire an SWE that are empowered by this. And example: last week, I saw a PM vibe code several different applications to demo what might get built after it gets prioritized by SWEs

bigger_cheese

5 days ago

I can share my experience I am not a software engineer. I am the "other" type of Engineer (chemical engineer). I tend to use AI in the same way I use google or Stack Overflow, mostly when I get stuck on syntax and my code editor isn't helping me, I know what I am attempting to do "should" be possible but I can't get the syntax right. For example a recent question I asked was

"Python, create an xarray with two dimensions from a pandas df"

It gave me a few lines of example code which was enough for me to figure out where I had messed up the syntax in my own code.

I have seen one of my junior coworkers copy+paste entire chunks of code from chatbot conversations and to be honest what he has produced is underwhelming the code is poorly structured difficult to reason about I have low confidence he understands what the bot has produced (and why it did things the way it did) and I don't have high confidence we'd be able to trust the accuracy of the figures this code was outputting.

grey-area

6 days ago

Not really I’m fine with anyone knocking stuff together but I think people should be aware of the limitations and dangers. Writing like this does nothing to inform and is overly positive IMO.

It’d be like insisting llms will replace authors of novels. In some sense they could but there are serious shortcomings and things like agents etc just don’t fix them.

bloat

6 days ago

So we replace the task of writing tedious boilerplate with the task of reading the AI's tedious boilerplate. Which takes just as long. And leaves you with less understanding. And is more boring.

mostlysimilar

6 days ago

All of these people advocating for AI software dev are effectively saying they would prefer to review code instead of write it. To each their own I guess but that just sounds like torture to me.

SchemaLoad

5 days ago

It's because these people don't know how to write it, think they know how to review it. Ship a todo list app in a day, and then write blog posts about how they are changing the world.

bradly

5 days ago

I'm not familiar with the author's work. What % of their time is spent writing code?

ethagnawl

5 days ago

The thought alone makes me want to hang up my (professional) keyboard and open a nursery/gardening center.

Mofpofjis

6 days ago

well said!

Philpax

6 days ago

You are either a very fast producer or a very slow reader. Claude and Gemini are much faster at producing code than I am, and reviewing their code - twice over, even - still takes less time than writing it myself.

seadan83

6 days ago

Reviewing code is often slower than writing it. You don't have to be an exceptionally fast coder or slow reviewer for that to be true.

tart-lemonade

6 days ago

The amount of time I spend going back and forth between the implementation and the test cases to verify that the tests actually fully cover the possible failure cases alone can easily exceed the time spent writing it, and that's assuming I don't pull the branch locally and start stepping through it in the debugger.

The idea that AI will make development faster because it eliminates the boring stuff seems quite bold because until we have AGI, someone still needs to verify the output, and code review tends to be even more tedious than writing boilerplate unless you're speed-reading through reviews.

thegeomaster

6 days ago

If this was the case, regular code review as a practice would be entirely unworkable.

Mofpofjis

6 days ago

"regular" code review is indeed a total theater, a travesty, a farce.

Real, meticulous code review takes absolutely forever.

JacobThreeThree

5 days ago

This speaks to the low quality assurance bar that most of the software industry lives by.

If you're programming for a plane's avionics, as an example, the quality assurance bar is much, much higher. To the point where any time-saving benefits of using an LLM are most likely dwarfed by the time it takes to review and test the code.

It's easy to say LLM is a game-changer when there are no lives at stake, and therefore the cost of any errors is extremely low, and little to no QA occurs prior to being pushed to production.

seadan83

6 days ago

Interesting point! I'd like to explore this a bit more.

Would you mind going into a bit more specifics/details on why regular code review practice would become unworkable, like which specific part(s) of it?

recursive

5 days ago

Huh? Why? How? Say the code takes one day to write, and two days to review. What about that is unworkable?

oblio

6 days ago

Are you, though? Reading code is harder, potentially much harder.[1]

And I suspect the act of writing it yourself imparts some lower level knowledge you don't get by skimming the output of an AI.

[1] https://www.joelonsoftware.com/2000/05/26/reading-code-is-li...

KyleBerezin

6 days ago

I think he is specifically referring to boilerplate code. It is not hard to understand boilerplate code.

ckiely

6 days ago

But you definitely don't understand it nearly as well as if you wrote it. And you're the one that needs to take responsibility for adding it to your codebase.

mewse

5 days ago

In this thread, we pretend that the difficult and time-consuming part of a code review is all the reading you have to do.

Mofpofjis

6 days ago

then you are a terribly sloppy reviewer

seadan83

6 days ago

Indeed, instead of writing code to shave a Yak, we're now instead reviewing how the Yak was (most-shittily) shaved.

ugh123

5 days ago

>So we replace the task of writing tedious boilerplate with the task of reading the AI's tedious boilerplate. Which takes just as long. And leaves you with less understanding. And is more boring.

These all sound like your projected assumptions. No, it generally does not take longer to review sizable code changes than it does to write it. This is further alleviated if the code passes tests, either existing or new ones created by the ai.

HDThoreaun

6 days ago

> Which takes just as long.

This has never once been my experience. Its definitely less fun but it takes way less time.

flufluflufluffy

6 days ago

and probably results in a greater net energy consumption/carbon output

davidclark

6 days ago

>If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing.

Here’s the thing from the skeptic perspective: This statement keeps getting made on a rolling basis. 6 months ago if I wasn’t using the life-changing, newest LLM at the time, I was also doing it wrong and being a luddite.

It creates a never ending treadmill of boy-who-cried-LLM. Why should I believe anything outlined in the article is transformative now when all the same vague claims about productivity increases were being made about the LLMs from 6 months ago which we now all agree are bad?

I don’t really know what would actually unseat this epistemic prior at this point for me.

In six months, I predict the author will again think the LLM products of 6 month ago (now) were actually not very useful and didn’t live up to the hype.

idlewords

5 days ago

An exponential curve looks locally the same at all points in time. For a very long period of time, computers were always vastly better than they were a year ago, and that wasn't because the computer you'd bought the year before was junk.

Consider that what you're reacting to is a symptom of genuine, rapid progress.

godelski

5 days ago

  > An exponential curve looks locally the same at all points in time

This is true for any curve...

If your curve is continuous, it is locally linear.

There's no use in talking about the curve being locally similar without the context of your window. Without the window you can't differentiate an exponential from a sigmoid from a linear function.

Let's be careful with naive approximations. We don't know which direction things are going and we definitely shouldn't assume "best case scenario"

sfpotter

5 days ago

A curve isn't necessarily locally linear if it's continuous. Take f(x) = |x|, for example.

nickff

5 days ago

There may have been a discontinuity at the beginning of time... but there was nobody there to observe it. More seriously, the parent is saying that it always looks continuous linear when you're observing the last short period of time, whereas the OP (and many others) are constantly implying that there are recent discontinuities.

godelski

5 days ago

I think they read curve and didn't read continuous.

Which ends up making some beautiful irony. One small seemingly trivial point fucked everything up. Even a single word can drastically change everything. The importance of subtlety being my entire point ¯\_(ツ)_/¯

godelski

5 days ago

|x| is piece wise continuous, not absolutely continuous

sfpotter

5 days ago

For a function to be locally linear at a point, it needs to be differentiable at that point... |x| isn't differentiable at 0, so it isn't locally linear at 0... that's the entirety of what I'm saying. :-)

godelski

5 days ago

You're not wrong. But it has nothing to do with what I said. I think you missed an important word...

Btw, my point was all about how nuances make things hard. So ironically, thanks for making my point clearer.

sfpotter

4 days ago

Nothing to do with what you said?

  This is true for any curve...

  If your curve is continuous, it is locally linear.

Hmm...

Sometimes naive approximations are all you've got; and in fact, aren't naive at all. They're just basic. Don't overthink it.

Retr0id

5 days ago

I don't think anyone's contesting that LLMs are better now than they were previously.

Nevermark

5 days ago

> It creates a never ending treadmill of boy-who-cried-LLM.

The crying wolf reference only makes sense as a soft claim that LLM’s better or not, are not getting better in important ways.

Not a view I hold.

Retr0id

5 days ago

The implicit claim is just that they're still not good enough (for whatever the use cases the claimant had in mind)

pera

5 days ago

A flatline also looks locally the same at all points in time.

EA-3167

5 days ago

Nor does local flatness imply direction, the curve could be descending for all that "looks locally flat" matters. It also isn't on the skeptics to disprove that "AI" is a transformative, exponentially growing miracle, it's on the people selling it.

simonw

6 days ago

tptacek wasn't making this argument six months ago.

LLMs get better over time. In doing so they occasionally hit points where things that didn't work start working. "Agentic" coding tools that run commands in a loop hit that point within the past six months.

If your mental model is "people say they got better every six months, therefore I'll never take them seriously because they'll say it again in six months time" you're hurting your own ability to evaluate this (and every other) technology.

cmdli

5 days ago

> tptacek wasn't making this argument six months ago.

Yes, but other smart people were making this argument six months ago. Why should we trust the smart person we don't know now if we (looking back) shouldn't have trusted the smart person before?

Part of evaluating a claim is evaluating the source of the claim. For basically everybody, the source of these claim is always "the AI crowd", because those outside the AI space have no way of telling who is trustworthy and who isn't.

kalkin

5 days ago

If you automatically lump anyone who makes an argument that AI is capable - not even good for the world on net, just useful in some tasks - into "the AI crowd", you will tautologically never hear that argument from anywhere else. But if you've been paying attention to software development discussion online for a few years, you've plausibly heard of tptacek and kentonv, eg, from prior work. If you haven't heard of them in particular, no judgement, but you gotta have someone you can classify as credible independently of their AI take if you want to be able to learn anything at all from other people on the subject.

secalex

5 days ago

Thomas is one of the pickier, crankier, least faddish technologists I've ever met. If he has gone fanboy that holds a lot of weight with me.

simonw

5 days ago

Part of being on Hacker News is learning that there are people in this community - like tptacek - who are worth listening to.

In general, part of being an effective member of human society is getting good at evaluating who you should listen to and who is just hot air. I collect people who I consider to be credible and who have provided me with useful information in the past. If they start spouting junk I quietly drop them from my "pay attention to these people" list.

JohnKemeny

6 days ago

But they say "yes, it didn't work 6 months ago, but it does now", and they say this every month. They're constantly resetting the goal post.

Today it works, it didn't in the past, but it does now. Rinse and repeat.

skwirl

5 days ago

It doesn’t really matter what this or that person said six months ago or what they are saying today. This morning I used cursor to write something in under an hour that previously would have taken me a couple of days. That is what matters to me. I gain nothing from posting about my experience here. I’ve got nothing to sell and nothing to prove.

You write like this is some grand debate you are engaging in and trying to win. But to people on what you see as the other side, there is no debate. The debate is over.

You drag your feet at your own peril.

deadbabe

5 days ago

The thing about people making claims like “An LLM did something for me in an hour that would take me days” is that people conveniently leave out what their own skill level is.

I’ve definitely seen humans do stuff in an hour that takes others days to do. In fact, I see it all the time. And sometimes, I know people who have skills to do stuff very quickly but they choose not to because they’d rather procrastinate and not get pressured to pick up even more work.

And some people waste even more time writing stuff from scratch when libraries exist for whatever they’re trying to do, which could get them up and running quickly.

So really I don’t think these bold claims of LLMs being so much faster than humans hit as hard as some people think they do.

And here’s the thing: unless you’re using the time you save to fill yourself up with even more work, you’re not really making productivity gains, you’re just using an LLM to acquire more free time on the company dime.

skwirl

5 days ago

Again, implicit in this comment is the belief that I am out to or need to convince you of something. You would be the only person who would benefit from that. I don’t gain anything from it. All I get out of this is having insulting comments about my “skill level” posted by someone who knows nothing about me.

deadbabe

5 days ago

You don’t know the harm you’re inflicting. Some manager will read your comment and conclude that anyone who isn’t reducing tasks that previously took hours or days into a brief 1 hour LLM session is underperforming.

In reality, there is a limit to how quickly tasks can be done. Around here, the size of PRs usually have changes that most people could just type out in under 30 minutes if they knew exactly what to type. However, getting to the point where you know exactly what you need to type takes days or even weeks, often collaborating across many teams and thinking deep about potential long term impacts down the road, and balancing company ROI and roadmap objectives, perhaps even running experiments.

You cannot just throw LLMs at those problems and have them wrapped up in an hour. If that’s what you’re doing, you’re not working on big problems, you’re doing basic refactors and small features that don’t require high level skills, where the bottleneck is mostly how fast you can type.

ThrowawayTestr

5 days ago

>And some people waste even more time writing stuff from scratch when libraries exist for whatever they’re trying to do

That's an argument for LLMs.

>you’re just using an LLM to acquire more free time on the company dime.

This is a bad thing?

maleldil

5 days ago

> you’re just using an LLM to acquire more free time on the company dime

You might as well do that since any productivity gains will go to your employer, not you.

astrange

5 days ago

That's not how FAANG compensation works.

maleldil

5 days ago

FAANG engineers are still working class.

astrange

3 days ago

It's hard to be working class on like 4x the median income and stock compensation.

You can't own your own SV home but you can become a slumlord somewhere else remotely.

Sebguer

5 days ago

this is only a compelling counter-argument if you are referring to a single, individual person who is saying this repeatedly. and there probably are! but the author of this article is not that person, and is also speaking to a very specific loop that only first truly became prevalent 6-9 months ago.

conception

5 days ago

I don’t think this is true actually. There was a huge shift of llm coding ability with the release of sonnet 2.5. That was a real shift in how people started using LMS for coding. Before that it was more of a novelty not something people used a lot for real work. As someone who is not a software engineer, as of about November 2024, I “write” hundreds of lines of code a day for meaningful work to get done.

ipaddr

5 days ago

How did you manage before?

conception

5 days ago

The work just wasn’t done. Or it took enough time for me to go and learn how to do it.

jasonfarnon

5 days ago

"they say this every month" But I think the commenter is saying "they" comprises many different people, and they can each honestly say, at different times, "LLMs just started working". I had been loving LLMs for solving NLP since they came out, and playing with them all the time, but in my field I've only found them to improve productivity earlier this year (gemini 2.5).

nsonha

5 days ago

Why focus on the 6 months or however long you think the cycle is. The milestones of AI coding are self-explanatory: autocomplete (shit) -> multi-files edit (useful for simple cases) -> agent (feedback loop with rag & tool use), this is where we are.

Really think about it and ask yourself if it's possible that AI can make any, ANY work a little more efficient?

jasonfarnon

5 days ago

I don't really get this argument. Technology can be improving can't it? You're just saying that people saying it's improving, isn't a great signal. Maybe not, but you still don't conclude that the tech isn't improving, right? If you're old enough, remember the internet was very much hyped. Al gore was involved. But it's probably been every bit as transformative as promised.

prmph

5 days ago

Technology improving is not the issue.

1. LLM fanboy: "LLMs are awesome, they can do x, y, and z really well."

2. LLM skeptic: "OK, but I tried them and found them wanting for doing x, y, and z"

3. LLM fanboy: "You're doing it wrong. Do it this way ..."

4. The LLM skeptic goes to try it that way, still finds it unsatisfactory. A few months pass....

5. LLM fanboy: "Hey, have you tried model a.b.c-new? The problems with doing x, y, and z have now been fixed" (implicitly now agrees that the original complaints were valid)

6. LLM skeptic: "What the heck, I though you denied there were problems with LLMs doing x, y, and z? And I still have problems getting them to do it well"

7. Goto 3

esperent

5 days ago

I stopped paying attention for a few days so I'm way out of date. What is the state of the art for agentic coding now?

I've been using Cline and it can do a few of the things suggested as "agentic", but I'd have no idea how to leave it writing and then running tests in a VM and creating a PR for me to review. Or let it roam around in the file tree and create new files as needed. How does that work? Are there better tools for this? Or do I need to configure Cline in some way?

simonw

5 days ago

tptacek is using Zed, which I've not tried myself.

I actually do most of my "agentic coding" (not a fan of the term, but whatever) in ChatGPT Code Interpreter, which hasn't changed much in two years other than massive upgrades to the model it uses - I run that mainly via o4-mini-high or o3 these days.

OpenAI's Codex is a leading new thing, but only if you pay $200/month for it. Google's equivalent https://jules.google/ is currently free.

GitHub Copilot gained an "agent mode" recently: https://github.blog/ai-and-ml/github-copilot/agent-mode-101-...

There's also Copilot Coding Agent, which is confusingly an entirely different product: https://github.blog/changelog/2025-05-19-github-copilot-codi...

garblegarble

5 days ago

I'd be quite interested in a more formal post with a detailed analysis of the effectiveness of the different agent impls, including Claude Code and Jetbrains Junie.

Do you use ChatGPT Code Interpreter because it's better, or is it just something you're more familiar with and you're sticking with it for convenience?

Of course, I don't know how one would structure a suitable test, since doing it sequentially would likely bias the later agents with clearer descriptions & feedback on the tasks. I imagine familiarity with how to prompt each particular model is also a factor.

simonw

5 days ago

I like Code Interpreter because I'm deeply familiar with it. I don't have to worry about safety at all because it's running in OpenAI's kubernetes container, not on my own laptop. I can control exactly what it can see by selectively uploading files to it. I know it can't make outbound network requests.

nsonha

5 days ago

The current state of agent in the last batch of project launches (copilot agent, jules, devin...) is to take over and do things in a PR like you want. However the vedict is still out there in terms of whether these implementation prove more useful than agentic code in an IDE.

whoisthemachine

5 days ago

Have the models significantly improved, or have we just developed new programs that take better advantage of them?

dcre

5 days ago

Both, but it’s mostly the models. The programs like Claude Code are actually simpler than the ones from before because of this.

jppittma

5 days ago

Both.

simonw

5 days ago

Yeah, definitely both.

New models come out all the time. One of the most interesting signals to look out for is when they tip over the quality boundary from "not useful at task X" to "useful at task X". It happened for coding about a year ago. It happened for search-based research assistants just two months ago, in my opinion - I wrote about that here: https://simonwillison.net/2025/Apr/21/ai-assisted-search/

orionsbelt

5 days ago

At what point would you be impressed by a human being if you asked it to help you with a task every 6 months from birth until it was 30 years old?

If you ask different people the above question, and if you vary it based on type of task, or which human, you would get different answers. But as time goes on, more and more people would become impressed with what the human can do.

I don't know when LLMs will stop progressing, but all I know is they continue to progress at what is to me a similar astounding rate as to a growing child. For me personally, I never used LLMs for anything, and since o3 and Gemini 2.5 Pro, I use them all the time for all sorts of stuff.

You may be smarter than me and still not impressed, but I'd try the latest models and play around, and if you aren't impressed yet, I'd bet money you will be within 3 years max (likely much earlier).

Velorivox

5 days ago

> At what point would you be impressed by a human being if you asked it to help you with a task every 6 months from birth until it was 30 years old?

In this context, never. Especially because the parent knows you will always ask 2+2 and can just teach the child to say “four” as their first and only word. You’ll be on to them, too.

x-complexity

5 days ago

> In this context, never. Especially because the parent knows you will always ask 2+2 and can just teach the child to say “four” as their first and only word. You’ll be on to them, too.

On the assumption that you'll always only ask it "what's 2+2?" Keywords being "always" & "you".

In aggregate, the set of questions will continuously expand as a non-zero percentage of people will ask new questions. The set of questions asked will continue to expand, and the LLMs will continue to be trained to fill in the last 20%.

Even under the best interpretations, this is the detractors continuously moving goalposts, because the last 20% will never be filled: New tasks will continuously be found, and critics will point to them as "oh, see, they can't do that". By the time that the LLMs can do those tasks, the goalpost will be moved to a new point and they'll continue to be hypocrites.

------

> > At what point would you be impressed by a human being if you asked it to help you with a task every 6 months from birth until it was 30 years old?

Taking GP's question seriously:

When a task consisting of more than 20 non-decomposable (atomic) sub-tasks is completed above 1 standard deviation of the human average in that given task. (much more likely)

When an advancement is made in a field by that person. (statistically much rarer)

Velorivox

5 days ago

To be clear, I’m just saying the analogy isn’t great, not that one can never be impressed by an LLM (or a person for that matter)!

stouset

5 days ago

I saw this article and thought, now's the time to try again!

Using Claude Sonnet 4, I attempted to add some better configuration to my golang project. An hour later, I was unable to get it to produce a usable configuration, apparently due to a recent v1-to-v2 config format migration. It took less time to hand-edit one based on reading the docs.

I keep getting told that this time agents are ready. Every time I decide to use them they fall flat on their face. Guess I'll try again in six months.

simonw

5 days ago

If you share your conversation (with the share link in Claude) I'd be happy to see if there are any tweaks I can suggest to how you prompted it.

porridgeraisin

5 days ago

Yes.

I made the mistake of procrastinating on one part of a project thinking "Oh, that is easily LLMable". By God, was I proven wrong. Was quite the rush before the deadline.

On the flip side, I'm happy I don't have to write the code for a matplotlib scatterplot for the 10000th time, it mostly gets the variables in the current scope that I intended to plot. But I've really not had that much success on larger tasks.

The "information retrieval" part of the tech is beautiful though. Hallucinations are avoided only if you provide an information bank in the context in my experience. If it needs to use the search tool itself, it's not as good.

Personally, I haven't seen any improvement from the "RLd on math problems" models onward (I don't care for benchmarks). However, I agree that deepseek-r1-zero was a cool result. Pure RL (plain R1 used a few examples) automatically leading to longer responses.

A lot of the improvements suggested in this thread are related to the infra around LLMs such as tool use. These are much more well organised these days with MCP and what not, enabling you to provide it the aforementioned information bank easily. But all of it is built on top of the same fragile next-token generator we know and love.

DarmokJalad1701

5 days ago

> It took less time to hand-edit one based on reading the docs.

You can give it the docs as an "artifact" in a project - this feature has been available for almost one year now.

Or better yet, use the desktop version + a filesystem MCP server pointing to a folder containing your docs. Tell it to look at the docs and refactor as necessary. It is extremely effective at this. It might also work if you just give it a link to the docs.

stouset

5 days ago

The agent reached out to the internet and pulled the golangci-lint docs. Repeatedly. After generating a v1-compatible config I pointed it to the v2 docs. It tried to do the automatic migration but still wound up with incorrect syntax. I asked it to throw away what it had and build a fresh v2-compatible config. It again consulted the docs. Repeat ad nauseam.

I threw in the towel and had a working config in ten minutes.

Yiin

5 days ago

you can add links to docs to llm agents instead of letting them work blindfolded with hardcoded assumptions

maleldil

5 days ago

Claude Code will even request access documentation on its own sometimes. I caught it asking to run a `pydoc` command the other day. I'm not sure if it has access to web search, but it should.

stouset

5 days ago

It reached out to the Internet and pulled the docs. Repeatedly. I even linked to the docs directly to be helpful.

mathgorges

5 days ago

In my experience it's less about the latest generation of LLMs being better, and more about the tooling around them for integration into a programmer's workflow being waaaay better.

The article doesn't explicitly spell it out until several paragraphs later, but I think what your quoted sentence is alluding to is that Cursor, Cline et al can be pretty revolutionary in terms of removing toil from the development process.

Need to perform a gnarly refactor that's easy to describe but difficult to implement because it's spread far and wide across the codebase? Let the LLM handle it and then check its work. Stuck in dependency hell because you updated one package due to a CVE? The LLM can (often) sort that out for you. Heck, did the IDE's refactor tool fail at renaming a function again? LLM.

I'm remain skeptical of LLM-based development insofar as I think the enshitification will inevitably come when the Magic Money Machine breaks down. And I don't think I would hire a programmer that needs LLM assistance in order to program. But it's hard to deny that it has made me a lot more productive. At the current price it's a no-brainer to use it.

tho23j4o3j4324

5 days ago

It's great when it works, but half the time IME it's so stupid that it can't even use the edit/path tools properly even when given line numbers prepended inputs.

(I should know since I've created half-a-dozen tools for this with gptel. Cline hasn't been any better on my codebase.)

karthink

5 days ago

Do Cursor and co have better tools than the ones we write ourselves for lower-level interfaces like gptel? Or do they work better because they add post-processing layers that verify the state of the repo after the tool call?

tho23j4o3j4324

5 days ago

Cursor is proprietary, but is known to index code for doing queries etc.

Cline is closer in spirit to GPTel, but since CLINE is an actual business, it does seem to do well off the bat. That said, I haven't found it to be "hugely better" compared to whatever you can hack in GPTel.

Quite frankly being able to hack the tools on the go in Elisp, makes GPTel far far better (for some of us anyway).

(Thanks for creating GPTel BTW!)

carpo

5 days ago

I think they just meant it hit an inflection point. Some people were copying pasting to ChatGPT and saying it was crap and others were using agents that could see the context of the code and worked much, much better. It's the workflow used not just the specific LLM.

libraryofbabel

5 days ago

This isn't a particularly useful filter, because it applies to many very successful technologies as well. Early automobiles generated a lot of hype and excitement, but they were not very good (unreliable, loud, and dangerous, and generally still worse than horses). They got steadily better until eventually they hit an inflection point where the skeptics were dug in repeating the same increasingly old complaints, while Henry Ford was building the Model T.

anxoo

6 days ago

name 5 tasks which you think current AIs can't do. then go and spend 30 minutes seeing how current AIs can do on them. write it on a sticky note and put it somewhere that you'll see it.

otherwise, yes, you'll continue to be irritated by AI hype, maybe up until the point where our civilization starts going off the rails

TheRoque

5 days ago

Well, I'll try to do a sticky note here:

- they can't be aware of the latest changes in the frameworks I use, and so force me to use older features, sometimes less efficient

- they fail at doing clean DRY practices even though they are supposed to skim through the codebase much faster than me

- they bait me into inexisting apis, or hallucinate solutions or issues

- they cannot properly pick the context and the files to read in a mid-size app

- they suggest to download some random packages, sometimes low quality ones, or unmaintained ones

simonw

5 days ago

"they can't be aware of the latest changes in the frameworks I use, and so force me to use older features, sometimes less efficient"

That's mostly solved by the most recent ones that can run searches. I've had great results from o4-mini for this, since it can search for the latest updates - example here: https://simonwillison.net/2025/Apr/21/ai-assisted-search/#la...

Or for a lot of libraries you can dump the ENTIRE latest version into the prompt - I do this a lot with the Google Gemini 2.5 models since those can handle up to 1m tokens of input.

"they fail at doing clean DRY practices" - tell them to DRY in your prompt.

"they bait me into inexisting apis, or hallucinate solutions or issues" - really not an issue if you're actually testing your code! I wrote about that one here: https://simonwillison.net/2025/Mar/2/hallucinations-in-code/ - and if you're using one of the systems that runs your code for you (as promoted in tptacek's post) it will spot and fix these without you even needing to intervene.

"they cannot properly pick the context and the files to read in a mid-size app" - try Claude Code. It has a whole mechanism dedicated to doing just that, I reverse-engineered it this morning: https://simonwillison.net/2025/Jun/2/claude-trace/

"they suggest to download some random packages, sometimes low quality ones, or unmaintained ones" - yes, they absolutely do that. You need to maintain editorial control over what dependencies you add.

TheRoque

5 days ago

Thanks for the links. You mentioned 2 models in your posts, how should I proceed ? I can't possibly pay 2 subscriptions.. do you have a question for the better one to use ?

simonw

5 days ago

If you're only going to pay one $20/month subscription I think OpenAI wins at the moment - their search tools are better and their voice chat interface is better too.

I personally prefer the Claude models but they don't offer quite as rich a set of extra features.

If you want to save money, consider getting API accounts with them and spending money that way. My combined API bill across OpenAI, Anthropic and Gemini rarely comes to more than about $10/month.

timr

5 days ago

> Or for a lot of libraries you can dump the ENTIRE latest version into the prompt - I do this a lot with the Google Gemini 2.5 models since those can handle up to 1m tokens of input.

See, as someone who is actually receptive to the argument you are making, sometimes you tip your hand and say things that I know are not true. I work with Gemini 2.5 a lot, and while yeah, it theoretically has a large context window, it falls over pretty fast once you get past 2-3 pages of real-world context.

> "they fail at doing clean DRY practices" - tell them to DRY in your prompt.

Likewise here. Simply telling a model to be concise has some effect, to be sure, but it's not a panacea. I tell the latest models do do all sorts of obvious things, only to have them turn around and ignore me completely.

In short, you're exaggerating. I'm not sure why.

simonw

5 days ago

I stand by both things I said. I've found that dumping large volumes of code I to the Gemini 2.5 models works extremely well. They also score very highly on the various needle in a haystack benchmarks.

This wasn't true of the earlier Gemini large context models.

And for DRY: sure, maybe it's not quite as easy as "do DRY". My longer answer is that these things are always a conversation: if it outputs code that you don't like, reply and tell it how to fix it.

timr

5 days ago

Yeah, I'm aware of the benchmarks. Thomas (author of TFA) is also using Gemini 2.5, and his comments are much closer to what I experience:

> For the last month or so, Gemini 2.5 has been my go-to (because it can hold 50-70kloc in its context window). Almost nothing it spits out for me merges without edits.

I realize this isn't the same thing you're claiming, but it's been consistently true for me that the model hallucinates stuff in my own code, which shouldn't be possible, given the context window and the size of the code I'm giving to it.

(I'm also using it for other, harder problems, unrelated to code, and I can tell you factually that the practical context window is much smaller than 2M tokens. Also, of course, a "token" is not a word -- it's more like 1/3 of a word.)

simonw

5 days ago

That's why I said 1m tokens, not 2m tokens. I don't trust them for 2m tokens yet.

travisjungroth

5 days ago

Those aren't tasks.

agotterer

5 days ago

> they can't be aware of the latest changes in the frameworks I use, and so force me to use older features, sometimes less efficient

This is where collaboration comes in play. If you solely rely on the LLM to “vibe code” everything, then you’re right, you get whatever it thinks is best at the time of generation. That could be wrong or outdated.

My workflow is to first provide clear requirements, generally one objective at a time. Sometimes I use an llm to format the requirements for the llm to generate code from. It then writes some code, and I review it. If I notice something is outdated I give it a link to the docs and tell it to update it using X. A few seconds later it’s made the change. I did this just yesterday when building out an integration with an api. Claude wrote the code using a batch endpoint because the steaming endpoint was just released and I don’t think it was aware of it. My role in this collaboration, is to be aware of what’s possible and how I want it to work (e.g.. being aware of the latest features and updates of the frameworks and libraries). Then it’s just about prompting and directing the llm until it works the way I want. When it’s really not working, then I jump in.

bdangubic

5 days ago

they can't be aware of the latest changes in the frameworks I use, and so force me to use older features, sometimes less efficient

of course they can, teach them / feed them latest changes or whatever you need (much like another developer unaware of the same thing)

they fail at doing clean DRY practices even though they are supposed to skim through the codebase much faster than me

tell them it is not DRY until they make it DRY. for some (several projects I’ve been involved with) DRY is generally anti-pattern when taken to extremes (abstraction gone awry etc…). instruct it what you expect and it and watch it deliver (much like you would another developer…)

they bait me into inexisting apis, or hallucinate solutions or issues

tell it when it hallucinates, it’ll correct itself

they cannot properly pick the context and the files to read in a mid-size app

provide it with context (you should always do this anyways)

they suggest to download some random packages, sometimes low quality ones, or unmaintained ones

tell it about it, it will correct itself

esjeon

5 days ago

Anecdotally, ChatGPT still struggles with its own API. It keeps juggling between different versions of its API and hallucinates API parameters, even when I force-feed official documents into the context (to be fair, the documentation is straight awful). Sometimes it totally refuses to change its basic assumptions, so I have to blow up the context just to make it use the up-to-date API correctly.

LLMs are stupid - nothing magic, nothing great. They’re just tools. The problem with the recent LLM craze is that people make too many obviously partially true statements.

simonw

5 days ago

That's because GPT-4o's training cut-off is Sep 30, 2023 (see https://platform.openai.com/docs/models/gpt-4o) and the OpenAI API has changed a LOT since then.

Claude 4 has a training cut-off of March 2025, I tried something today about its own API and it gave me useful code.

apwell23

5 days ago

> tell it when it hallucinates, it’ll correct itself

no it doesn't. Are you serious?

bdangubic

5 days ago

just today 3 times and countless times before… you just gotta take some serious time to learn and understand it… or alternatively write snarky comments on the internet…

soraminazuki

4 days ago

So when LLMs go around in circles, as it often does [1], that's a skill issue. But when it gets it right some of the time, that's proof of superiority.

This is the kind of reasoning that dominates LLM zealotry. No evidence given for extraordinary claims. Just a barrage of dismissals of legitimate problems. Including the article in discussion.

All of this makes me have a hard time taking any of it seriously.

[1]: https://news.ycombinator.com/item?id=44050152

apwell23

5 days ago

intresting. for me it just keeps making up new stuff that doesn't exist when i feed it the error and telling it hallucinates.

perhaps ppl building crud webapps have different experience than ppl building something niche?

bdangubic

5 days ago

I don't build CRUD apps, 1099 contractor so too expensive to do that kind of work :)

apwell23

5 days ago

> - they bait me into inexisting apis, or hallucinate solutions or issues

yes. this happens to me almost every time i use it. I feel like a crazy person reading all the AI hype.

motza

5 days ago

I have definitely noticed these as well. Have you ever tried prompting these issues away? I'm thinking this might be a good list to add to every coding prompt

bradfa

5 days ago

They also can’t hold copyright on their creations.

alisonatwork

5 days ago

The problem with AI hype is not really about whether a particular model can - in the abstract - solve a particular programming problem. The problem with AI hype is that it is selling a future where all software development companies become entirely dependent on closed systems.

All of the state-of-the-art models are online models - you have no choice, you have to pay for a black box subscription service controlled by one of a handful of third-party gatekeepers. What used to be a cost center that was inside your company is now a cost center outside your company, and thus it is a risk to become dependent on it. Perhaps the risk is worthwhile, perhaps not, but the hype is saying that real soon now it will be impossible to not become dependent on these closed systems and still exist as a viable company.

apwell23

5 days ago

> name 5 tasks which you think current AIs can't do.

For coding it seems to back itself into a corner and never recover from it until i "reset" it .

AI can't write software without an expert guiding it. I cannot open a non trivial PR to postgres tonight using AI.

simonw

5 days ago

"AI can't write software without an expert guiding it. I cannot open a non trivial PR to postgres tonight using AI."

100% true, but is that really what it would take for this to be useful today?

poincaredisk

5 days ago

1. create a working (moderately complex) ghidra script without hallucinating.

Granted I was trying to do this 6 months ago, but maybe a miracle has happened. But I'm the past I had very bad experience with using LLMs for niche things (i.e. things that were never mentioned on stackoverflow)

simonw

5 days ago

I've never heard of Ghidra before but, in case you're interested, I ran that prompt through OpenAI's o3 and Anthropic's Claude Opus 4 for you just now (both of them the latest/greatest models from those vendors and new as of less than six months ago) - results here: https://chatgpt.com/share/683e3e38-cfd0-8006-9e49-2aa799dac4... and https://claude.ai/share/7a076ca1-0dee-4b32-9c82-8a5fd3beb967

I have no way of evaluating these myself so they might just be garbage slop.

jcranmer

5 days ago

The first one doesn't seem to actually give me the script, so I can't test it.

The second one didn't work for me without some code modification (specifically, the "count code blocks" didn't work), but the results were... not impressive.

It starts by ignoring every function that begins with "FUN_" on the basis that it's "# Skip compiler-generated functions (optional)". Sorry, but those functions aren't compiler-generated functions, they're functions that lack any symbol names, which in ghidra terms, is pretty damn common if you're reverse engineering unsymbolized code. If anything, it's the opposite of what you would want, because the named functions are the ones I've already looked at and thus give less of a guideline for interesting ones to look into next.

Looking at the results at a project I had open, it's supposed to be skipping external functions, but virtually all the top xrefs are external functions.

Finally, as a "moderately complex" script... it's not a good example. The only thing that approaches that complexity is trying to count basic blocks in a function--something that actually engages with the code model of Ghidra--but that part is broken, and I don't know Ghidra well enough to fix it. Something that would be more along the lines of "moderately complex" to me would be (to use a use case I actually have right now) for example turning the constant into a reference to that offset in the assumed data segment. Or finding all the switch statements that ghidra failed to decompile!

AtlasBarfed

5 days ago

Everyone keeps thinking AI improvement is linear. I don't know if this is correct, but it's just my basic impression that the current AI boost came from instead of being limiting yourself to the CPU and its throughput adding the massive amount of computing power in graphics cards.

But for each nine of reliability you want out of llms everyone's assuming it's just a linear growth. I don't think it is. I think it's polynomial at least.

As for your tasks and maybe it's just cuz I'm using chat GPT, but I asked it to Port sed, something with full open source code availability, tons of examples/test cases, a fully documented user interface and I wanted it moved to Java as a library.

And it failed pretty spectacularly. Yeah it got the very very very basic functionality of sed.

kaydub

5 days ago

Of course it didn't port sed like that. It doesn't matter that it's open source with tons of examples/test cases. It's not going to go read all the code and change it to a different language. It can pick out what sed's purpose is and it built it for you in the language you asked.

chinchilla2020

5 days ago

If AI can do anything, why can't I just prompt "Here is sudo access to my laptop, please do all my work for me, respond to emails, manage my household budget, and manage my meetings".

I've tried everything. I have four AI agents. They still have an accuracy rate of about 50%.

ipaddr

5 days ago

Make me a million dollars

Tell me about this specific person who isn't famous

Create a facebook clone

Recreate Windows including drivers

Create a way to transport matter like in Star Trek.

I'll see you in 6 months.

someothherguyy

5 days ago

Also, professional programmers have varying needs. These people are coding in different languages, with varying complexity, domains, existing code bases and so on.

People making arguments based on sweeping generalizations to a wide audience are often going to be perceived as delusional, as their statements do not apply universally to everyone.

To me, thinking LLMs can code generally because you have success with them and then telling others they are wrong in how they use them is making a gigantic assumptive leap.

spacemadness

5 days ago

I just assume every blog post in HN starts with “As a web dev, TITLE”

dolebirchwood

5 days ago

> Here’s the thing from the skeptic perspective: This statement keeps getting made on a rolling basis.

Dude, just try the things out. It's just undeniable in my day-to-day life that I've been able to rely on Sonnet (first 3.7 and now 4.0) and Gemini 2.5 to absolutely crush code. I've done 3 side projects in the past 6 months that I would have been way too lazy to build without these tools. They work. Never going back.

ryandrake

5 days ago

Why can't reviews of AI be somewhere in the middle between "useless" and "the second coming"?

I tried Copilot a few months ago just to give it a shot and so I could discuss it with at least a shred of experience with the tool, and yea, it's a neat feature. I wouldn't call it a gimmick--it deserves a little more than that, but I didn't exactly cream my pants over it like a lot of people seem to be doing. It's kind of convenient, like a smart autocomplete. Will it fundamentally change how I write software? No way. But it's cool.

dolebirchwood

5 days ago

My anecdotes are all I've got, bro. If they come a little creamy, well, YMMV.

killerstorm

5 days ago

Bullshit. We have absolute numbers, not just vibes.

The top of SWE-bench Verified leaderboard was at around 20% in mid-2024, i.e. AI was failing at most tasks.

Now it's at 70%.

Clearly it's objectively better at tackling typical development tasks.

And it's not like it went from 2% to 7%.

lexandstuff

5 days ago

Isn't SWE-bench based on public Github issues? Wouldn't the increase in performance also be explained by continuing to train on newer scraped Github data, aka training on the test set?

The pressure for AI companies to release a new SOTA model is real, as the technology rapidly become commoditised. I think people have good reason to be skeptical of these benchmark results.

killerstorm

5 days ago

That sounds like a conspiracy theory. If it was just some mysterious benchmark and nothing else then sure, you have reasons to be skeptical.

But there's a plenty of people who actually tried LLMs for actual work and swear they work now. Do you think they are all lying?..

Many people with good reputation, not just noobs.

slg

6 days ago

>We imagine artists spending their working hours pushing the limits of expression. But the median artist isn’t producing gallery pieces. They produce on brief: turning out competent illustrations and compositions for magazine covers, museum displays, motion graphics, and game assets.

One of the more eye-opening aspects of this technology is finding out how many of my peers seemingly have no understanding or respect for the concept of art.

simonw

6 days ago

How do you mean?

slg

6 days ago

Whole libraries have been written over millennia about the importance and purpose of art, and that specific quote reduced it all down to nothing more than the creation of a product with a specific and mundane function as part of some other product. I genuinely feel bad for people with that mindset towards art.

jowea

5 days ago

I think that quote is talking about commercial art and there being a market willing to pay a large number of artists to do relatively mundane artworks. It does not exclude the possibility of artists doing art for art's sake as a hobby or a few elite artists doing paid high culture art. It's like when photography became a thing and there was a lot less paid work available to painters.

slg

5 days ago

I don't see that as a distinction worth making. Commercial art is still art.

Music, for example, is an incredibly commercialized art. Replacing every song or album I have ever purchased with AI generated facsimiles is also an incredibly depressing thought. And let me tell you, my tastes aren't "a few elite artists doing paid high culture art".

I would hope people still find value in painting, especially in a world with photography. That is even ignoring the strained nature of this analogy. The context of the original quote was in a discussion of the inherent plagiarism of AI. Photography wasn't invited by stealing painters work.

throwaway743

5 days ago

There will be people who see something created by AI, are told it was made by a human, and have an emotional reaction as if it were. That alone challenges the idea that emotional impact depends on human authorship.

Does knowing a human made something automatically make it more valuable? Should it? Shouldn't the work speak for itself, rather than rely on the cult of personality around its creator?

These discussions always seem to focus on form as if that is what defines art. But in many cases concept is more important. Duchamp didn't craft the urinal. The idea was the art. If a piece moves someone, and that reaction changes based on who or what made it, what does that really say about how we judge art?

jowea

5 days ago

I mean, context changing how we see art seems natural enough. Would anyone care about analyzing this painting[1] if wasn't created by Hitler? Would anyone care about this child's drawings[2] if it wasn't 800 years old? Would people care about a stained glass depiction of the Crucifixion of Jesus if it wasn't a central event in Christianity?

Personally, I think in a lot of cases, people want to feel some sort of emotional connection with the artist through their creation, which doesn't work if you know it's AI created.

[1] https://commons.wikimedia.org/wiki/File:Adolf_Hitler_Der_Alt...

[2] https://en.wikipedia.org/wiki/Onfim

jowea

5 days ago

The big distinction is that the cheaper AIs will crowd out the humans out of the market, so mass market commercial art will be made by AIs if it is possible to produce that art. But some people will still want non-AI art, which I believe will be focused on less commercial focused art sectors.

> Music, for example, is an incredibly commercialized art. Replacing every song or album I have ever purchased with AI generated facsimiles is also an incredibly depressing thought.

And just to be clear, I'm not saying you're wrong.

> I would hope people still find value in painting, especially in a world with photography.

Sure, people do, but it is now a hobby for some and high art for a smaller number of professional painters, but the market willing to sustain a large number of professional painters doing portraits is gone.

> That is even ignoring the strained nature of this analogy. The context of the original quote was in a discussion of the inherent plagiarism of AI. Photography wasn't invited by stealing painters work.

I think the analogy is relevant because I am discussing the plagiarism of AI in relation to the economic aspects of copyright infringement and the impacts on the market for artists and SW devs. Not in relation to the moral rights[1] of authors. The issue of artists being annoyed on principle, not on economic effects, that some souless computer is producing plagiarist art that imitates their artstyle without attribution is a separate but closely related issue. I'm not sure but I think the article is more concerned with the former issue.

[1] https://en.wikipedia.org/wiki/Moral_rights

slg

5 days ago

>I think the analogy is relevant because I am discussing the plagiarism of AI in relation to the economic aspects of copyright infringement and the impacts on the market for artists and SW devs. Not in relation to the moral rights[1] of authors. The issue of artists being annoyed on principle, not on economic effects, that some souless computer is producing plagiarist art that imitates their artstyle without attribution is a separate but closely related issue. I'm not sure but I think the article is more concerned with the former issue.

How can you justify separating the two concerns? This article is a defense of AI against its critics. It is a pretty poor defense if the argument is along the lines of "certain ethical concerns don't count". The author being "more concerned with" one issue doesn't make the other issue invalid or irrelevant.

jdminhbg

4 days ago

> Replacing every song or album I have ever purchased with AI generated facsimiles

The error here is that the quote from the article says "the median artist" and you have never purchased a song or album by a median musician.

zemo

5 days ago

art to them is a commodity, not a facet of expression or cultural participation.

wiseowise

5 days ago

Nobody talks about art that survived millennia.

mordae

5 days ago

https://en.wikipedia.org/wiki/Beowulf_(2007_film)

debugnik

5 days ago

Never been to an archeological museum? There's a couple just in my city, they definitely show art that survived millennia.

wiseowise

5 days ago

Not sure what you’re talking about specifically.

My point is that nobody argues that AI replaces art that survived millennia.

debugnik

5 days ago

Neither did the comment you first replied to, so I probably read your reply differently than you intended.

jryle70

5 days ago

Huh? just because most of my day spent writing simple scripts for cleaning up data, to use as an example, doesn't mean I didn't have many many years honing my programming skills.

Someone who points out I'm mostly doing mundane things is telling the truth, not that they underestimate my knowledge or skills or don't appreciate the beauty of code.

slg

5 days ago

I don't know how you interpreted my comment about art being undervalued as me undervaluing coding, but it is the wrong interpretation of what I said.

ckiely

6 days ago

The argument that programmers are into piracy and therefore should shut up about theft is nonsensical. Not defending piracy, but at least an artist or creator is still credited and their work is unadulterated. Piracy != plagiarism.

grose

6 days ago

It's also ignoring the fact that much plagiarized code is already under permissive licenses. If Star Wars or Daft Punk were CC-BY-SA nobody would need to pirate them, and there may even be a vibrant remix culture... which is kind of the whole point of open source, is it not?

mkfs

5 days ago

These LLMs don't respect those permissive licenses, though. Especially the GPL, but even MIT requires attribution through inclusion of a copyright notice.

grose

2 days ago

Yes, exactly, that is my point.

nickpsecurity

5 days ago

I'll add it's not true for programmers with morality. For instance, people who follow Jesus Christ are commanded to obey the law and treat people right. Many think respecting copyright is both. While I oppose copyright law, I do uphold it and like supporting content creators.

Also, I think there's an argument similar to cryptocurrency companies that run like pyramid schemes. I could've made easy money doing security work for them. Yet, I felt like I'd be participating in helping them rob people or advancing their con. (Some jobs, like building assurance tools, might be OK.) Likewise, using tools built on massive, copyright infringement might be supporting or promoting that.

So, I gotta use legally-trained models or wait for legal reforms that make LLM training legal. Especially the data sets they distribute which is currently illegal, file sharing.

TheCraiggers

6 days ago

> "For art, music, and writing? I got nothing. I’m inclined to believe the skeptics in those fields."

You've already lost me, because I view programming as an art form. I would no more use AI to generate code than I would use it to paint my canvas.

I think the rest of the article is informative. It made me want to try some things. But it's written from the perspective of a CEO thinking all his developers are just salt miners; miners go into the cave and code comes out.

I think that's actually what my hangup is. It's the old adage of programmers simply "copying and pasting from stack overflow" but taken to the extreme. It's the reduction of my art into mindless labor.

ACCount36

6 days ago

People don't pay programmers to produce great art. No one sees that "art" and no one cares. They pay programmers to get shit done.

skydhash

6 days ago

A functional code that is easy to maintain is art (but you have to be an experienced programmer to see it). A shoddy project isn’t, but the whole company feels the pain.

Mofpofjis

5 days ago

Wrong.

I do pay my general contractor for the unseen quality that goes into the structure of my home. Programs should be built the same way.

tptacek

6 days ago

Woodworking is also an art form. But most people just need furniture, fixtures, and structures. Nobody would take seriously the idea that new construction all be done with sashimono joinery in order to preserve the art form, but somehow we're meant to take seriously the idea of hand-dovetailed CRUD apps.

bradly

5 days ago

> Woodworking is also an art form. But most people just need furniture, fixtures, and structures. Nobody would take seriously the idea that new construction all be done with sashimono joinery in order to preserve the art form, but somehow we're meant to take seriously the idea of hand-dovetailed CRUD apps.

How many furniture makers did you talk to forming this opinion? The metaphor does not line up with either my software of furniture experience. I work with production furniture shops that choose not to use CNCs to avoid the soul being sucked out of the work. This is not a rare stance to take and this is not "japanese joinery" woodworking. This is real work, balancing the means of production with optimal quality. There is all sorts of arguments on whether cncs or using a domino or whatever is "real" woodworking, but the idea that this choice of quality does not exist in woodworking and so we shouldn't have it in software is not my experience.

pvg

5 days ago

You don't need to talk to furniture makers to know that mass produced furniture has replaced cabinetmakery almost completely. Most of us are sitting on the evidence.

bradly

5 days ago

It is not about being mass produced or not–it is this reoccurring theme by people who do not spend their days writing code saying that mediocre code is good enough. It is not for me. Code decays. Mediocre code today is bade code tomorrow. Not everyone shares this pov–totally fine–but the tone of the article is tough.

tptacek

5 days ago

I'm a professional software developer and I'm saying mediocre code is often good enough. Mediocre code does not in fact decay. The belief that every line of code you write on every problem definition has to be crystalline and formally rigorous is an early-career thing you get over. Fretting over literally every line of code is a way of doing your job badly. The skill is in knowing when you need really interesting (and "interesting" is the distinction here) code, and when you should just whip our your shop jig and repeat the exact same pocket hole joint you've been using for the last 10 years.

I don't want to step on the cabinetry discussion, just, I think it's important to call out this idea that there's a universal quality/intricacy/interestingness threshold for all production software. That was a common fallacy long before LLMs (you used to see a lot of it in Rails culture with people ceaselessly grooming unit test suites and designing ever-more-seamless mocking systems). Part of growing in this profession is getting a sense for when extra exertion is needed, and when it's wasted.

bradly

5 days ago

> Mediocre code does not in fact decay.

That's okay. I think all code does. Some much faster than others. I could be wrong, but I think I'm sticking to this opinion (for now at least)

> The belief that every line of code you write on every problem definition has to be crystalline and formally rigorous is an early-career thing you get over. Fretting over literally every line of code is a way of doing your job badly.

I agree. I do not think this is a debated topic. That being said, I don't see writing quality software to be a waste of time the same I don't see a quality weld. It needs to be relible to what the stakeholders have defined as adaquate.

> The skill is in knowing when you need really interesting (and "interesting" is the distinction here) code, and when you should just whip our your shop jig and repeat the exact same pocket hole joint you've been using for the last 10 years.

Ahh alright so this is where is gets interesting! I think you are close, but not "whip out". As a woodworking you need to be able to make that jig! The is real knowledge/wealth.

I'm still under the belief great devs will use the tools at their disposal as they see fit to write high quality software. Poor devs will write poor quality software with the tools at their disposal. I don't think any of this changes with AI.

I think where I struggle is even as someone who use AI to help write code everyday and would have a hard time going back, these articles do not sit with me.

tptacek

5 days ago

Sure, you need to be able to make the jig. But what does that have to do with anything? Once you do, you have it. Is your point that LLMs are problematic for less-skilled programmers, who need the rote tasks to build up fluency? That's probably true! It doesn't change anything for skilled programmers, though.

bradly

5 days ago

> Is your point that LLMs are problematic for less-skilled programmers, who need the rote tasks to build up fluency?

Yes, I think less-skilled programmers are being bombarded with messages that in the future of coding being less-skilled is adequate. What I've seen in today's layoff culture is not something that gives me hope for tomorrows junior developers.

I think there is huge and vast gap between: "mediocre software is good enough" and "as a highly skilled dev I understand when I need to be clever or not". The latter has been true for ages and the former is the feeling AI pushers are pushing.

I think the term "mediocre" represents both poor quality code and boring code and I think this word is chosen because of this ambiguity. I am absolutely onboard with boring code and I'm definitely not okay with mediocre code if it is not reliable, maintainable, and/or readable per the stakeholders.

> It doesn't change anything for skilled programmers, though.

Exactly! Its just another tool. A great one, sure, but we were great devs before AI and we will be great devs after.

These are hard conversations to be have via message boards for sure. Thanks for the time. Would love to grab coffee or a drink and talk more.

pvg

5 days ago

Right but just like most people were looking for a cheap shirt, pot or chair, most people are looking for 'cheaper way to make computers do what I want'. These changes didn't happen because people hated art or failed to inquire with weavers, potters and joiners.

bradly

5 days ago

But woodworkers still woodwork just fine with mass production. Just there are mass produced things. And there are not. So then there can be mass-produced code and there can not be. The article feel all or none and I do not think that translate to woodworking or software.

pvg

5 days ago

I don't think the article says that. People still write games for the NES in assembly but that's effectively unrelated to current software development practice.

bradly

5 days ago

Again, this comparison that professional woodworking is anything like NES game building is just so far from my experience. Cabinet makers (and to a less extent furniture makers) are numerous and very successful.

Mass production doesn't have to eliminate the alternative to exist. The same way fast-food and cheap groceries are not a threat to quality restaurants.

pvg

5 days ago

This 'only one can exist' is something you brought in, though. Neither the article nor I am saying that.

layer8

6 days ago

I don't think that analogy matches very well. Most software is bespoke, the domain requirements, usage aspects, and architectural trade-offs are subtly, or often non-subtly, different each time, and take different trajectories over time. It's not like you're producing the same software 10,000 times, like a piece of furniture. And AI isn't able to produce the exact same thing reproducibly anyway. A better argument would be that AI is actually approaching the craftsmanship/artisanal capabilities.

sneak

6 days ago

Most line of business apps and business logic are only bespoke and custom insofar as field names and relations and what APIs they trigger on which events.

Just because software is “bespoke” doesn’t mean it’s complicated or special.

layer8

6 days ago

> Most line of business apps and business logic are only bespoke and custom insofar as field names and relations and what APIs they trigger on which events.

That's not my experience. Of course, everything is just a finite state machine operating on a memory tape.

kiitos

4 days ago

I'm not sure why you're equivocating "all code" with "CRUD apps". If that were the case I'd be with you. But it isn't. Glue code between APIs is indeed pointless garbage that can be automated, but that's not what I, nor many other engineers, are writing day-to-day.

drbojingle

6 days ago

That's cause there's an element of mindless labour to it. It's easier to spot that so it gets more focus.

pydry

6 days ago

If you find that theres an element of mindless labor to coding then you're probably doing it wrong.

drbojingle

5 days ago

Get a key logger and track the repeated keys you hit. Arrow keys, hjkl or w/em just the keys that take you to the place you need to be.

While your at it you can tell me about your favorite click.

pydry

5 days ago

Ok. My favorite letter is apparently E.

Does that mean ive never written anything creative before?

drbojingle

5 days ago

No but it means your missing my point. Do you click those keys for their own sake, or are you doing it for some other reason?

Personally I don't code because I love the letter E. I'd prefer if they all appeared on the screen without me doing any typing at all. It's a means to an end.

What one person enjoy might be toil to another.

pydry

4 days ago

Im not doing it for its own sake but it's also a tiny fraction of the effort of coding. I spend thr vast majority of my effort thinking, very little on typing.

deanCommie

6 days ago

I'm sure salt miners needed to make peace with their toil and also focused on tools and techniques to be more productive; how to remove the salt most elegantly in nice clean blocks, minimize waste, reduce burden on their physical bodies.

But to their bosses their output was salt.

I'm sorry but unless you're working in open source for the pure love of the tech/craft, the output of software engineering is PROBLEM SOLVING.

That's why "build vs. buy" exists - sometimes it's better to buy a solution than buy one. That's why a valid solution to a problem sometimes is to convince a customer that their ask is wrong or unreasonable, and something simpler or easier would get them 99% of what they need with 1% of the effort.

That's our job.

soraminazuki

4 days ago

Sure, for a large portion of our industry, the goal is to hoover up as much user data as cheaply as possible. Being responsible with that data isn't part of that "PROBLEM SOLVING."

lapcat

6 days ago

> If you were trying and failing to use an LLM for code 6 months ago, you’re not doing what most serious LLM-assisted coders are doing.

This sounds like the "No true Scotsman" fallacy.

> People coding with LLMs today use agents. Agents get to poke around your codebase on their own.

That's a nonstarter for closed source, unless everything is running on-device, which I don't think it is?

> Part of being a senior developer is making less-able coders productive

Speak for yourself. It's not my job.

haiku2077

5 days ago

You can run the agents on your own infrastructure (all the way down to a Mac Mini sitting on your desk), or Microsoft, OpenAI and I'm pretty sure Anthropic can sell you an Enterprise service that guarantees a certain level of confidentiality. I work in aerospace, one of the most paranoid industries, and even we got a Copilot subscription that met our needs...

lapcat

5 days ago

> You can run the agents on your own infrastructure (all the way down to a Mac Mini sitting on your desk)

How does that work exactly? Do you have a link?

> Microsoft, OpenAI and I'm pretty sure Anthropic can sell you an Enterprise service that guarantees a certain level of confidentiality

These companies hoovered up all of our content without notice, permission, or compensation, to train their models. I wouldn't trust them one bit. My personal opinion is that it's foolish to trust them.

> I work in aerospace, one of the most paranoid industries

Paranoid about what exactly?

simonw

5 days ago

If you're not going to trust them when they say "here is a contract that guarantees we won't train on your data" because they trained on a scrape of the web, you're never going to get the benefit from these tools. I guess that's your call. I chose to believe companies when they contractually oblige themselves to not do things.

blibble

5 days ago

Microsoft and Google are of course both famously known for studiously obeying contracts, the law, and not stabbing their partners in the back when it goes against their monetary interests

haiku2077

5 days ago

> How does that work exactly? Do you have a link?

https://ollama.com lets you run models on your own hardware and serve them over a network. The you point your editor at that server, eg https://zed.dev/docs/ai/configuration#ollama

DiabloD3

5 days ago

Don't use Ollama, use llama.cpp instead.

moomin

5 days ago

Who are these friends? Are they in the room with us right now? Look, maybe my experience is atypical but I’m an AI skeptic and I know plenty of others. I’ve never heard people claim that LLMs are a fad or going to go away.

I’ve seen lots of people:

* think that conflating LLMs and “AI” produces a lot of poorly reasoned arguments

* doubt the economic narratives being built around LLM technology

* think the current rate of progress in the technology is basically flat

* think most “AI companies” resemble most crypto companies

An addendum to the last point: very few crypto skeptics deny that BitCoin is a thing or think it’s going away, either. It’s just strawmanning.

tptacek

5 days ago

These are 4 arguments that have literally nothing to do with what I wrote.

* I don't care what "AI" is. There's a section in the post about this. Nobody needs to update their religious beliefs because of LLMs.

* I don't care about the economic futures of Nvidia or the frontier model companies. I'm not recommending you invest in their stock.

* The article opens with "all progress in LLMs could halt and this would still apply".

* The article doesn't care about "AI companies".

I'm a crypto skeptic and I do deny Bitcoin is a long-term thing, for what it's worth. I'm a hard skeptic of cryptofinance.

moomin

5 days ago

As I say, those are the arguments that I actually hear people make. And most of the arguments I'm hearing people make are from people who are using LLM tech on a daily basis (not the agent stuff so much, although it sounds promising). I remain surprised you're encountering so many people quite so skeptical.

The only real dispute about the usefulness of the current technology I hear is the concern that this will leave us without devs who understand the code at a deep level.

pvg

5 days ago

Maybe it would have been clearer without the dunktrope at the start of the comment.

eleveriven

5 days ago

Like you said, nobody reasonable is denying that something real is happening - the question is more about how useful, how soon, and for whom

cpach

5 days ago

Not all hype is just hot air.

Case in point: WWW and the “dot com” bubble. Dot com was a bubble, but the WWW (and Internet) was a big thing that transformed all major societies. It survived the bust that followed after the boom.

Are LLMs here to stay? We can guess, but as usual, time will tell.

esjeon

5 days ago

It's like the problem of half-full vs half-empty. We see LLM can handle certain tasks, but counter-examples are too far from being rare. So, "LLM can do A" is always followed by "LLM fails at A".

The problem is that LLM never performs consistently. It works when it works. It doesn't when It doesn't. No one knows exactly why, and no one can tell when it's gonna fail. For example, even to this day, GPT sometimes gives me wrong calculations, even when it is instructed to use calculator for that. Who knows why it ignores the instruction, nor why it can't reliably perform the addition of two integers. That really screws up with the automation.

Anyways, I'm really tired of skeptic-skeptics. I hate some people believe "half-full" is genuinely better than "half-empty". I refuse that idea completely. It's more about which context you're in. If you need exactly a cup of water, it's half-empty. If you are lucky to have some water, it's half-full. If you have a clear direction you want to go, you discover a lot of deficiencies. If you're there just enjoying the moment, yeah, you can just keep enjoying the new toy.

grafmax

5 days ago

> Our jobs are just as much in tech’s line of fire as everybody else’s have been for the last 3 decades. We’re not East Coast dockworkers; we won’t stop progress on our own.

If you really believe in the power of LLMs then it’s time to wake up. The writing is on the wall. Automation the workforce further into precarious jobs.

The idea that tech workers can’t stop so-called “progress” is at best a dumb self-fulfilling prophecy. Our workplaces depend on us. We have the power to put the brakes on whatever our employers are up to by organizing and striking.

Tech workers should be organizing to prepare for the profit-taking moves management has in store for us as the tech gets better and better. If LLMs really live up to their potential, It’s just going to get worse from here.

They keep bragging about how many people are going to lose their jobs and they mean us. They’re bragging about firing us! It’s foolish for us to sit idly by while we are the ones who make them their profits.

Aeolun

5 days ago

> We have the power to put the brakes on whatever our employers are up to by organizing and striking.

You have the power to do that in your own isolated environment, but it's a delaying tactic at best.

This whole thing is as inevitable as TV, smartphones, airplanes, tanks or guns. Everyone will use them, because they're so much better than anything that came before that there's no competition for the same use cases.

grafmax

5 days ago

The point isn’t to delay the tech from arriving but to have better negotiating power when it fully arrives.

swalsh

5 days ago

> Tech workers should be organizing to prepare for the profit-taking moves management has in store for us

I think you think this is going to help tech workers, but raising the cost of employing humans is only going to incentivize companies to adopt AI faster.

You should do the opposite, and tax AI use. Though that's probably rife with potential issues too. I think as umemployment increases we're going to have to make a bad decision somewhere... and I'm guessing taxing AI is the lesser evil.

grafmax

5 days ago

Historically many unions were formed during times of mass automation. Unions don’t unilaterally drive up wages. They increase workers’ negotiating power. This is needed to help workers best navigate the times ahead.

The time for tough decisions is today. There is no interest from the companies that control our politicians in supporting taxes, basic income, or whatever other policy proposal people think will occur after AI. Worker leverage will only be diminished after mass layoffs. We will end up in a situation of social unrest. To pull out of that workers will be forced to organize to increase negotiating power. Sooner is better than later, if we can realize it’s in our interests to do so.

dartharva

5 days ago

This is the cost of getting used to outrageously high salaries as compared to other engineering fields. Market forces allowed you to inflate your income beyond reason, now the same forces are taking it away. Turns out decoupling your pay from value delivered wasn't such a good idea after all.

grafmax

5 days ago

The fact that companies pay these prices shows the money is there. The money is there because of the value these workers create.

dartharva

5 days ago

.. and now the money won't be there because the value has fallen in the era of AI code generators.

grafmax

5 days ago

If the businesses are profiting the money is there. Organizing gives workers leverage to gain a larger portion of the profits they are generating for their employers.

dyauspitr

5 days ago

If the tech is there dragging your feet is not going to buy you too much time.

wvenable

5 days ago

If work can meaningfully be done by a machine, why should we be fighting against that? For jobs? If life is all about doing pointless work a machine can do, we've got problems as a species.

The problem is capitalism not LLMs. Fighting against this progress just leaves us stagnant as a species.

Perhaps AI will lead to the revolution where automation frees the average person from having to toil in a factory or a cubical farm instead of trying to achieve infinite growth in a closed system.

thoughtpeddler

5 days ago

I was recently "pair-vibe-coding" with someone who's much 'smarter' than I am, certainly when it comes to coding, especially Python. He's always been in the LLM skeptic camp, and it was fascinating to see that because of his extensive Python knowledge, his prompting was actually very minimal and weak, one might even say 'lazy'. The output he got from o3 for our given task was therefore really mediocre, and had a few hallucinations (which could've been avoided if he spent a few more seconds or minutes prompting).

I, knowing far less than him, would've had a much more elaborate prompt, and o3 would've proved a lot more competent/capable. Yet with my friend, since he knows so much already, and has such a high bar, he thinks the AI should be able to do a lot more with just a few basic words in a prompt... yet, for that same reason, he (understandably) doubts the inevitable sub-par output.

That's what makes all these debates about "Why are smart people doubting LLMs??" so pointless. The smarter you are, the less help you need, so the less prompting you do, the less context the model has, the less impressive the output, and the more the smart person thinks LLMs suck. With this logic, of course the smartest people are also the biggest skeptics!

sho_hn

5 days ago

I doubt this holds true generally. The smart coders I know who are also LLM users generally develop a decent intuition for what the models are good and bad at, and how to steer them into good performance.

thoughtpeddler

5 days ago

Then perhaps my friend has remained a skeptic for so long that he's atrophied in this regard (which OP's post touches on). Either way, most of his day job is as a CTO/manager at a startup, so he's not in the weeds coding as much anymore in the first place. I should've seen how he prompts LLMs for managerial tasks, then I'd know whether his 'prompt laziness' was systemic or influenced by his coding knowledge.

dawnbreez

4 days ago

Throwing my two cents in here...I think there's a disconnect between what AI advocates want, and what everyone else wants.

The arguments against genAI tend to point out things like: 1. Its output is unreliable at best 2. That output often looks correct to an untrained eye and requires expert intervention to catch serious mistakes 3. The process automates away a task that many people rely on for income

And the response from genAI advocates tends to be dismissive...and I suspect it is, in part, because that last point is a positive for many advocates of genAI. Nobody wants to say it out loud, but when someone on Reddit or similar claims that even a 10% success rate outweighs the 90% failure rate, what they mean is most likely "A machine that works 10% of the time is better than a programmer who works 60-80% of the time because the machine is more than 6-to-8-times cheaper than the programmer".

There's also the classic line about how automation tends to create more jobs in the future than it destroys now, which itself is a source of big disconnects between pro-genAI and anti-genAI crowds--because it ignores a glaring issue: Just because there's gonna be more jobs in the future, doesn't mean I can pay rent with no job tomorrow!

"You can write an effective coding agent in a week" doesn't reassure people because it doesn't address their concerns. You can't persuade someone that genAI isn't a problem by arguing that you can easily deploy it, because part of the concern is that you can easily deploy it. Also, "you’re not doing what the AI boosters are doing" is flat-out incorrect, at least if you're looking at the same AI boosters I am--most of the people I've seen who claim to be using generated code say they're doing it with Claude, which--to my knowledge--is just an LLM, albeit a particularly advanced one. I won't pretend this is anything but anecdata, but I do engage with people who aren't in the "genAI is evil" camp, and...they use Claude for their programming assistance.

"LLMs can write a large fraction of all the tedious code you’ll ever need to write" further reinforces this disconnect. This is exactly why people think this tech is a problem.

The entire section on "But you have no idea what the code is!" falls apart the moment you consider real-world cases, such as [CVE-2025-4143](https://nvd.nist.gov/vuln/detail/cve-2025-4143), where a programmer who is a self-described expert working with Claude--who emphasizes that he checked over the results with a fine-toothed comb, and that he did this to validate his own skepticism about genAI!--missed a fundamental mistake in implementing OAuth that has been common knowledge for a long while. The author is correct in that reading other people's code is part of the job...but this is difficult enough when the thing that wrote the code can be asked about its methods, and despite advances in giving LLMs a sort of train of thought, the fact remains that LLMs are designed to output things that "look truth-y", not things that are logically consistent. (Ah, but we're not talking about LLMs, even though kentonv tells us that he just used an LLM. We're talking about agentic systems. No true AI booster would "just" use an LLM...)

I actually agree with the point about how the language can catch and point out some of the errors caused by hallucination, but...I can generate bad function signatures just fine on my own, thank you! :P In all seriousness, this addresses basically nothing about the actual point. The problem with hallucination in a setting like this isn't "the AI comes up with a function that doesn't exist", that's what I'm doing when I write code. The problem with hallucination is that sometimes that function which doesn't exist is my RSA implementation, and the AI 'helpfully' writes an RSA implementation for me, a thing that you should never fucking do because cryptography is an incredibly complex thing that's easy to fuck up and hard to audit, and you really ought to just use a library...a thing you [also shouldn't leave up to your AI.](https://www.theregister.com/2025/04/12/ai_code_suggestions_s...) You can't fix that with a language feature, aside from having a really good cryptography library built into the language itself, and as much as I'd love to have a library for literally everything I might want to do in a language...that's not really feasible.

"Does an intern cost $20/month? Because that’s what Cursor.ai costs," says the blog author, as if that's supposed to reassure me. I'm an intern. My primary job responsibility is getting better at programming so I can help with the more advanced things my employer is working on (for the record, these thoughts are my own and not those of my employer). It does not make me happy to know that Cursor.ai can replace me. This also doesn't address the problem that, frankly, large corporations aren't going to replace junior developers with these tools; they're going to replace senior developers, because senior developers cost more. Does a senior engineer cost 20 dollars a month? Because that's what Cursor.ai costs!

...and the claim that open source is just as responsible for taking jobs is baffling. "We used to pay good money for databases" is not an epic own, it is a whole other fucking problem. The people working on FOSS software are in fact very frustrated with the way large corporations use their tools without donating so much as a single red cent! This is a serious problem! You know that XKCD about the whole internet being held up by a project maintained by a single person in his free time? That's what you're complaining about! And that guy would love to be paid to write code that someone can actually fucking audit, but nobody will pay him for it, and instead of recognizing that the guy ought to be supported, you argue that this is proof that nobody else deserves to be supported. I'm trying to steelman this blogpost, I really am, but dude, you fundamentally have this point backwards.

I hope this helps others understand why this blogpost doesn't actually address any of my concerns, or the concerns of other people I know. That's kind of the best I can hope for here.

kentonv

4 days ago

> 1. Its output is unreliable at best

> 2. That output often looks correct to an untrained eye and requires expert intervention to catch serious mistakes

The thing is this is true of humans too.

I review a lot of human code. I could easily imagine a junior engineer creating CVE-2025-4143. I've seen worse.

Would that bug have happened if I had written the code myself? Not sure, I'd like to think "no", but the point is moot anyway: I would not have personally been the one to write that code by hand. It likely would have gone to someone more junior on the team, and I would have reviewed their code, and I might have forgotten to check for this all the same.

In short, whether it's humans or AI writing the code, it was my job to have reviewed the code carefully, and unfortunately I missed here. That's really entirely on me. (It's particularly frustrating for me as this particular bug was on my list of things to check for and somehow I didn't.)

> 3. The process automates away a task that many people rely on for income

At Cloudflare, at least, we always have 10x more stuff we want to work on then we have engineers to work on it. The number of engineers we can hire is basically dictated by revenue. If each engineer is more productive, though, then we can ship features faster, which hopefully leads to revenue growing faster. Which means we hire more engineers.

I realize this is not going to be true everywhere, but in my particular case, I'm confident saying that my use of AI did not cause any loss of income for human engineers, and likely actually increased it.

dawnbreez

3 days ago

I mean, fair. It's true that humans aren't that great at writing code that can't be exploited, and the blogpost makes this point too: between a junior engineer's output and an LLM's output, the LLM does the same thing for cheaper.

I would argue that a junior engineer has a more valuable feature--the ability to ask that junior engineer questions after the fact, and ideally the ability to learn and eventually become a senior engineer--but if you're looking at just the cost of a junior engineer doing junior engineer things...yeah, no, the LLM does it more efficiently. If you assume that the goal is to write code cheaper, LLMs win.

However, I'd like to point out--again--that this isn't going to be used to replace junior engineers, it's going to be used to replace senior engineers. Senior engineers cost more than junior engineers; if you want each engineer to be more productive per-dollar (and assume, like many shareholders do, that software engineers are fungible) then the smart thing to do is replace the more costly engineer. After all, the whole point of AI is to be smart enough to automate things, right?

You and I understand that a senior engineer's job is very different from a junior engineer's job, but a stockholder doesn't--because a stockholder only needs to know how finance works to be a successful stockholder. Furthermore, the stockholder's goal is simply to make as much money as possible per quarter--partly because he can just walk out if the company starts going under, often with a bigger "severance package" than any of the engineers in the company. The incentives are lined up not only for the stockholder to not know why getting rid of senior engineers is a bad idea, but to not care. Were I in your position, I would be worried about losing my job, not because I didn't catch the issue, but because

Aside: Honestly, I don't really blame you for getting caught out by that bug. I'm by no means an expert on anything to do with OAuth, but it looks like the kind of thing that's a nightmare to catch, because it's misbehavior under the kind of conditions that are--well, only seen when maliciously crafted. If it wasn't something that was known about since the RFC, it would probably have taken a lot longer for someone to find it.

kentonv

3 days ago

Luckily, shareholders do not decide who to hire and fire. The actual officers of the company, hopefully, understand why senior engineers are non-fungible. Tech companies, at least, seem to understand this well. (I do think a lot of non-tech companies that nevertheless have software in their business get this wrong, and that's why we see a lot of terrible software out there. E.g. most car companies.)

As for junior engineers, despite the comparisons in coding skill level, I don't think most people are suggesting that AI should replace junior engineers. There are a lot of things humans do which AI still can't, such as seeing the bigger picture that the code is meant to implement, and also, as you note, learning over time. An LLM's consciousness ends with its context window.

dawnbreez

2 days ago

Apparently I missed the end of a sentence near the end there. "But because" on the fourth paragraph is supposed to be "but because the sales pitch is that the machine can replace me". Oops.

GenshoTikamura

5 days ago

[flagged]

dang

5 days ago

Could you please stop breaking the site guidelines, as you did here and in https://news.ycombinator.com/item?id=44167429? If you keep posting like that we're going to have to ban you again. I don't want to do that, but we can't have commenters abusing HN like this.

If you'd please review https://news.ycombinator.com/newsguidelines.html and stick to the rules when posting here, we'd appreciate it. Note these, for example:

"Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith."

"When disagreeing, please reply to the argument instead of calling names. 'That is idiotic; 1 + 1 is 2, not 3' can be shortened to '1 + 1 is 2, not 3."

"Please don't fulminate."

https://news.ycombinator.com/newsguidelines.html

Ragnarork

5 days ago

It's interesting also how these takes consistently ignore spectacularly the environmental cost of these as well.

I'm a bit on the fence myself, as I think it's very harmful, but I can also see ways it can be useful. But it's absolutely mindblowing how this is nearly always completely out of the discussion even though our current way of living and powering things is on a timer and we still haven't addressed it as a whole.

rel_ic

5 days ago

Thanks for posting this! I was also flabbergasted and it's nice to see at least one mention of the huge environmental impact!

slashdev

5 days ago

Do you know what the environmental footprint of a software developer is?

I’m only half joking.

jes5199

5 days ago

the “environmental costs” argument is based upon like fifteen false assumptions. by 2030, AI will be running almost entirely on solar+wind+battery, because that is by far the cheapest option

Nullabillity

5 days ago

> by 2030, AI will be running almost entirely on solar+wind+battery

Bullshit.

Even if this was true (and so far that doesn't seem to be the case), that's not how commodities work.

You can't just measure how much your thing uses, because even if it was running purely on green energy you also have to keep in mind other consumers that end up displaced onto worse sources. The only fair way to measure this is by the worst generator on your grid, because that's who would be shut down w/o your demand.

And even if we assume that the entire grid was green, building out that capacity also has (environmental) costs! As does producing those GPUs, for that matter.

jes5199

5 days ago

there will be _less_ fossil fuel use, globally, in 2030 than there is today, and AI will not be a significant reason that changes in either direction.

wvenable

5 days ago

What's the environmental cost a human doing the same thing?

GardenLetter27

5 days ago

What environmental cost? The uranium is going to decay on its own anyway, we might as well use that energy for useful things.

Maybe LLMs will help lead to a breakthrough in nuclear fusion or battery research.

Degrowth is the philosophy of poverty, and a poverty of philosophy.

timdiggerm

5 days ago

Degrowth and "Maybe this uses too much electricity" are not the same thing, particularly when a nontrivial portion of US generation is fossil-fuel based.

As for the breakthroughs, maybe they will, maybe they won't; it's not much of an argument.

rel_ic

5 days ago

This mindset is cancerous

GuinansEyebrows

5 days ago

endless growth is the philosophy of an economic system that does not regard the long-term health of the planet and its flora and fauna. what's your point?

v3xro

5 days ago

Yes pretty much spot on. And if he thinks that

> they’re smarter than me. And when they get over this affectation, they’re going to make coding agents profoundly more effective than they are today.

Then I really hope he's in for a surprise.

hlpn

5 days ago

Rage bait and little more.

rfrey

5 days ago

your opinion on that sidebar issue.

I for one am glad not everybody writes their op-eds while cosplaying as an HR-drone writing press releases.

AnimalMuppet

5 days ago

False dichotomy. One can at the same time write not as an HR drone, and also not as an aggressive jerk.

kmeisthax

5 days ago

in re: "but the plagiarism"

As someone who thinks modern copyright law is unfit for purpose, and more specifically that software shouldn't be copyrightable at all, my main complaint about LLMs is the same as Chinese spying and IP theft: I only care because they don't share.

I'm perfectly fine using open-weights LLMs[0] to do things LLMs are good at. But I would never trust using OpenAI or Perplexity's SaaSS models as a replacement for my own skills. Half the reason they put things behind a server is so they can spy on you[1], the other half is so they can promise a revenue stream to the people they plan to sell the company to so they can enshittify it. The only enshittification-resistant software is software people can't charge you for.

All the other things this blog post is complaining about are real reactions to real harms, even in the FOSS community. People use LLMs as skilled labor substitute and get burned, while creating a spam problem for everyone else. In fact, that's the real replacement threat for artists: diffusion models don't so much outperform human artists as much as they convince people not to pay for art.

[0] I don't give two flying fucks about the license terms.

[1] The judge in the NYT lawsuit is even ordering spying for evidentiary reasons

mgraczyk

6 days ago

We are beyond the point of trying to convince naysayers.

I will simply not hire anybody who is not good at using LLMs, and I don't think I would ever work with anybody who thinks they aren't very useful. It's like working with somebody who things compilers are useless. Obviously wrong, not worth spending time trying to convince.

To anyone who reads this article and disagrees with the central point: You are missing the most important thing that will happen in your career. You should reevaluate because you will be unemployable in a few years.

andrepd

6 days ago

I don't think most people with mixed feelings in LLMs (or heretic naysayers as you put it) would want to work in a place like that, so perhaps you are doing everyone a favour!

sanderjd

6 days ago

I think this is a reasonable response. But I also think it's worth taking the parent's compiler analogy seriously as a thought experiment.

Back when I was in college in the 00s, if I had developed a preference for not using compilers in my work, I might have been able to build a career that way, but my options would have been significantly limited. And that's not because people were just jerks who were biased against compiler skeptics, or evil executives squeezing the bottom line, or whatever. It's because the kind of software most people were making at that period of time would have been untenable to create without higher level languages.

In my view, we clearly aren't at this point yet with llm-based tooling, and maybe we never will be. But it seems a lot more plausible to me that we will than it did a year or even six months ago.

mgraczyk

6 days ago

It reminds me of many of the people I worked with early in my career.

They were opposed to C++ (they thought C was all you need), opposed to git (they used IBM clearcase or subversion), opposed to putting internal tools in a web browser (why not use Qt and install the tool), opposed to using python or javascript for web services (it's just a script kiddie language), opposed to sublime text/pycharm/vscode (IDEs are for people who don't know how to use a CLI).

I have encountered it over and over, and each time these people get stuck in late career jobs making less than 1/3 of what most 23 year old SWEs I know are making.

candiddevmike

6 days ago

At some point, folks just want stability. I don't think you're at that point in your career, but the technology treadmill eventually burns everyone. Ironically, you're most likely going to use GenAI to counteract what is the same scenario (learning GenAI means I never have to learn again).

mgraczyk

6 days ago

Yeah of course, and that sounds like a great hobby you can have (like the woodworking example in the article).

But if you expect to get paid, you need to keep up and stay productive.

And it doesn't burn everyone out. All of the best 50+ year old engineers I know use LLMs constantly.

SirHumphrey

6 days ago

They were probably also opposed to some other things that failed.

But then hindsight is 20/20.

sanderjd

6 days ago

Yes, but honestly, I was this way at the beginning of my career, and I can't think of any examples of things I was right about.

My most successful "this is doomed to fail" grouchiness was social media games (like Farmville).

But I just can't think of any examples in the dev tooling space.

mgraczyk

6 days ago

Being opposed to things that fail is not important.

You can rightly avoid new things 99% of the time, but if you miss the 1% of things that matter, you get left behind.

On the other hand if you adopt the latest thing 100% of the time and 99% of those things are a waste, you will probably be fine.

recursive

5 days ago

> You should reevaluate because you will be unemployable in a few years.

When this happens, then I'll figure out how to get good at prompting.

aykutcan

6 days ago

> Do you like fine Japanese woodworking? All hand tools and sashimono joinery? Me too. Do it on your own time.

Ughh. That "own time." Spoken like a true middle manager who thinks passion is a liability.

gyomu

6 days ago

The author also wrote

“Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed.”

This tells you all you need to know about the kind of work the author values.

Next time you wonder why so much software is buggy, slow, half considered slop, you’ll know exactly who it was made by.

dbalatero

5 days ago

As I recall, fly.io was going down an awful lot last time I peeked in.

donatj

6 days ago

And this goes at least part of the way towards explaining why Fly.io has been the second least reliable cloud provider I've ever used, after Azure.

dang

5 days ago

"Don't be snarky."

"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

https://news.ycombinator.com/newsguidelines.html

donatj

5 days ago

I apologize. I genuinely did not mean snark, nor to be dismissive. It was a sincere honest-to-god "oh wow this actually explains the situation over there" mixed with an expression of frustration with their service quality.

A lot of their pages barely work and feel like first drafts, much of the UI is even labeled as a work in progress, and manytools regularly just fail to do what they're supposed to.

Deployments are routinely broken, with Twitter lit up like a light bulb about it while their status page remains all services operational.

I deploy three completely separate applications with Fly.io and have regular unscheduled downtime.

I stick with them despite all this because they're cheap and easily scale regular Go apps down to zero which is wonderful for apps that see intermittent/irregular usage patterns.

The fact that they are adamant defenders of vibe coding going so far in the post as to say "who cares about hallucinations" genuinely explains a fair deal about their unprofessional feeling service.

Mofpofjis

5 days ago

quick, the address of a burn clinic

theusus

5 days ago

Numerous words to make a very small point. Also, a lot of colloquial words used that made it difficult to comprehend the sentences.

dnoberon

6 days ago

This reads even more like an angry teenager than my angsty high school diary. I'm not sure how many more strawmans and dismissive remarks I can handle in one article.

dang

5 days ago

"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

"When disagreeing, please reply to the argument instead of calling names. 'That is idiotic; 1 + 1 is 2, not 3' can be shortened to '1 + 1 is 2, not 3."

https://news.ycombinator.com/newsguidelines.html

debugnik

5 days ago

Is there a similar rule for submissions, or are submitters exempt from adopting HN culture? "Please don't submit shallow arguments, specially criticizing other people's work?"

Because we've recently been getting a series of low quality submissions which obviously drive low quality discussion, but for some reason it's only the commenters who get reprimanded, as if the submissions weren't ruining the mood one by one.

(And to clarify, I haven't been warned, I'm not writing this out of spite.)

dang

5 days ago

You can flag submissions that you think don't belong on Hacker News. Other than that it's best to just turn your attention to other stories that do interest you.

puttycat

6 days ago

The language in this post is just terrible.

davidgerard

5 days ago

fly.io is of course closed source, right?

so we can't look at the code and see the demonstrable engineering excellence

trust me bro, you fool, you poltroon,

simonw

5 days ago

Their client application (written in Go) is open source: https://github.com/superfly/flyctl

They have a few other notable open source components, I think these two are particularly interesting:

- https://github.com/superfly/corrosion

- https://github.com/superfly/litefs

(Disclosure: they sponsor some of my work.)

mindwok

5 days ago

I am pretty exhausted with reading the cynical takes on AI lately. Like, I get it, but ugh.

Yes, the tech isn't perfect yet. Yes, it means existing industries will be displaced. Yes, it threatens to undermine the identities we have built as useful artists, programmers, data analysts, whatever. It is a scary, uncertain future, and on some days I am overcome with dread about what it means for me.

But ffs, we will figure it out. This technology, if it continues to improve at the pace it has, will open the door to us solving previously intractable problems - problems like curing cancer, endless energy and nuclear fusion, making us 100x more productive at building great housing and infrastructure. Let's get on board and make this thing a success, not fight it every step of the way because we want to preserve some basically arbitrary state of society.

millipede

5 days ago

"AI is Amazing" - People who barely write any code.

bitpush

6 days ago

A writing for the ages. I've found most of the LLM skeptics are either being hypocritical or just being gate-keepy (we dont want everyone to write code)

superconduct123

5 days ago

I find the AI proponents have an insane level of egocentrism

They cannot possibly imagine someone has a different use case where the AI didn't work

"I crank out shitty webapps all day, therefore every single other dev does. Everyone obviously has the same use case as me because I am the center of the universe"

sneak

6 days ago

THANK YOU.

I was a 3-4x programmer before. Now I’m a 9-15x programmer when wrangling LLMs.

This is a sea change and it’s already into “incredible” territory and shows no signs of slowing down.

> Think of anything you wanted to build but didn’t. You tried to home in on some first steps. If you’d been in the limerent phase of a new programming language, you’d have started writing. But you weren’t, so you put it off, for a day, a year, or your whole career.

I have been banging out little projects that I have wanted to exist for years but always had on the back burner. Write a detailed readme and ask the agent to interrogate you about the missing parts of the spec then update the README. Then have it make a TODO and start implementing. Give it another code base for style guide.

I’ve made more good and useful and working code in the last month than I have in the last two years.

FridgeSeal

6 days ago

That’s nothing, I was a 4X programmer and now I’m a 500x programmer!

I don’t just run one agent, I run all of them!

My time to close tickets is measured in minutes!

I don’t even review code, I have a different agent review it for me!

ofjcihen

6 days ago

And to make sure that agent doesn’t make mistakes I have a different agent review that agents work!

tptacek

6 days ago

Why would you do this and not just read the code yourself?

grey-area

6 days ago

Well given reading code is more tedious than writing it and the author of this article claims gai is most useful for tedious or repetitive code, why would you want to read it? Since this AI agent understands and reasons about the text it writes and reads it should be pretty infallible at checking the code too, right?

Just get another agent to review it and merge it, job done.

tsimionescu

6 days ago

Plus, the sales agents are running to promote the finished product to other companies and close deals, and the accounting agents are checking that all expenses are being accounted for and we have a positive cash flow. Obviously, the audit agents are checking that no errors sneak into this process, according to a plan devised by the legal agents.

yifanl

6 days ago

Because at thats what you need to do to get a 943x coder black belt

hooverd

6 days ago

why not use AI to summarize the code for you?

happytoexplain

6 days ago

The parent neglected to add /s

sneak

6 days ago

I can’t tell if you’re being sarcastic or not, but if you are, the real world is not far behind. I can imagine a world where a mixture of AI agents (some doing hypercritical code review) can return you tested and idiomatic PRs faster than you can describe the new architecture in issues.

I think a lot of people are unfamiliar with the (expensive) SOTA.

indigodaddy

6 days ago

Lol omg I guess your original comment was NOT sarcastic!?

nico

6 days ago

I'm not sure about giving specific metrics or kpis of efficiency or performance

It definitely feels different to develop using LLMs, especially things from scratch. At this point, you can't just have the LLM do everything. Sooner or later you need to start intervening more often, and as the complexity of the project grows, so does the attention you need to give to guiding the LLM. At that point the main gains are mostly in typing and quickly looking some things up, which are still really nice gains

ryandrake

5 days ago

I hate how the discourse around LLM-assisted programing is so polarized. It's either detractors saying it's "a fad that's useless and going nowhere, wasting billions of megawatts every year" or it's true believers calling it "the most miraculous sea change technological advancement in my lifetime" or "more important than fire and electricity[1]." There just doesn't seem to be any room in the middle.

I tried out Copilot a few months back to see what all the fuss was about and so that I could credibly engage with discussions having actually used the technology. I'd rate it as "kind of neat-o" but not earth shattering. It was like the first time I used an IDE with auto-complete. Oh, cool, nice feature. Would I pay monthly for it? No way. Would I integrate it into my development workflow if it were free? Maybe, I guess? Probably wouldn't bother unless it came literally set up for me out of the box like autocomplete does nowadays.

Don't get me wrong--it's cool technology. Well done, AI people. Is it "the 2nd most important thing to happen over the course of my career" as OP wrote? Come on, let's come down to earth a little.

1: https://www.cnbc.com/2018/02/01/google-ceo-sundar-pichai-ai-...

kaydub

5 days ago

I don't know. I think 9-12 months ago I'd agree with you. But I feel like the last 6 months my productivity has vastly improved. Not only that, it's also brought back a little bit of passion for the field.

It's easy to come up with some good ideas for new project, but then not want to do a lot of the garbage work related to the project. I offload all that shit to the LLM now.

Seriously, the LLMs have increased my productivity 2-4x.

sneak

5 days ago

Copilot is a bad yardstick. The article literally addresses exactly this. It’s not just “cool technology”, that’s the point. It enables things that were previously impossible.

I spent $600 on claude via cursor last month and it was easily worth 2-3x that.

ryandrake

5 days ago

Since the "state of the art" seems to change every week, what's a good way to try out the current "state of the art", without spending $600? I'd love to give it a shot and be proven wrong, but I'm not going to spend 1/2 a mortgage payment on a trial.

EDIT: Looks like the "Cursor" thing has a free trial. Might start there.

sneak

4 days ago

I spent $600 because I did something like 5200 uses.

You can start off for much less. I recommend trying claude-4-opus max/thinking. There might be cheaper options but that’s the one that has given me the best results so far.

MegaButts

6 days ago

> I was a 3-4x programmer before. Now I’m a 9-15x programmer

What the fuck does this mean?

throwawayqqq11

6 days ago

It means cranking out hello world even faster i guess. I wonder how complex all these projects are people are proud to have completed with the help of AI.

wvenable

5 days ago

I don't use AI to crank out complex parts of projects -- I use to crank out the tedious straight forward stuff that takes a lot of time that is necessary but low-value. Then I'm freed up to work on the hard and interesting stuff.

kaydub

5 days ago

It can honestly do a lot of complex stuff. But sometimes you have to guide it there.

mouse_

6 days ago

Nerds got taken aside and talked to about how it's not nice or cool to brag about IQ score so they invented a new artificial metric to brag about.

hansvm

6 days ago

It depends on the value of x. I think it's safe to assume x <= 0.75, else they'd contribute negatively to their teams (happens from time to time, but let's be generous). Previously they'd be anywhere from a 0/10 to 3/10 programmer, and now they get up to 9/10 on a good day but sometimes are a net negative, as low as -2.25/10 on a bad day. I imagine that happens when tired or distracted and unable to adequately police LLM output.

sneak

6 days ago

It’s a riff on the “10x programmer” concept. People who haven’t worked with 10x programmers tend to not believe they exist.

I’m nowhere near that, but even unaided I’m quite a bit faster than most people I’ve hired or worked with. With LLMs my high quality output has easily tripled.

Writing code may be easier than reading it - but reading it is FASTER than writing it. And that’s what matters.

surgical_fire

6 days ago

0x3 and 0x15 is the same value.