Rochus
3 days ago
The article claims that senior developers with over 10 years of experience are more than twice as likely to heavily rely on AI tools compared to their junior counterparts. No p-values or statistical significance tests are reported in either The Register article or Fastly's original blog post.
I have over 30 years of experience and recently used Claude Opus 4.1 (via browser and claude.ai) to generate an ECMA-335 and an LLVM code generator for a compiler, and a Qt adapter for the Mono soft debugging protocol. Each task resulted in 2-3kLOC of C++.
The Claude experience was mixed; there is a high probability that the system doesn't respond, or just quickly shows an overloaded message and does nothing. When it does generate code, I quickly run into an output limit and have to manually press "continue", and then the result often gets scrambled (i.e. the order of the generated code fragments gets mixed up, which requires another round with Claude to fix).
After this process, the resulting code compiled immediately, which impressed me. But it is full of omissions and logical errors, and I am still testing and correcting. All in all, I can't say at this point that Claude has really taken any work off my hands. To understand the code and assess the correctness of the intermediate results, I need to know exactly how I would implement the problem myself. And you have to test everything in detail and do a lot of redesigning and correcting. Some implementations are just stubs, and even after several attempts, there was still no implementation.
In my opinion, what is currently available (via my $20 subscription) is impressive, but it neither replaces experience nor does it really save time.
So yes, now I'm one of the 30% of seniors who have used AI tools, but I didn't really benefit from them in these specific tasks. Not surprisingly, the original blog also states that nearly 30% of senior developers report "editing AI output enough to offset most of the time savings". So not really a success so far. But all in all, I'm still impressed.
epolanski
3 days ago
Imho your post summarizes 90% of the posts I see about AI coding on HN: not understanding the tools, not understanding their strengths and weaknesses, not being good at prompting or context management, yet forming strong(ish) opinions.
If you don't know what they are good at and how to use them, of course you may end up with mixed results, and yes, you may waste time.
That's a criticism I also have of AI super-enthusiasts (especially vibe coders, although you won't find many here): they often confuse the fact that LLMs can one-shot 80% of a solution with the idea that LLMs are 80% of the way there, whereas the Pareto principle applies well to software development: it's the hardest 20% that's going to prove difficult.
Rochus
3 days ago
I'm pretty good at prompting, and I successfully use Perplexity (mostly with Claude Sonnet 4) to develop concepts, sometimes with the same session extended over several days. I think its user interface is much superior to claude.ai's. My hope was that the newer Claude Opus 4.1 would be much better at solving complicated coding tasks, which doesn't seem to be the case. For this I had to subscribe to claude.ai. Actually, I didn't see much difference in performance, but a much worse UI and availability experience. When it comes to developing a complex topic in a factual dialogue, Claude Sonnet Thinking seems to me even more suitable than Claude Opus.
epolanski
3 days ago
I'll be more detailed in my second reply.
1) Your original post asks a lot, if not too much, out of the LLM. Your expectations are so high that to get anywhere near decent results you would need a super-detailed prompt (if not several spec documents), and your conclusion stands true: it might be faster to just do it manually. That's the state of LLMs as of today. Your post neither hints at such detailed and laborious prompting nor seems to recognize that you've asked too much of it, which suggests you are not very comfortable with the limitations of the tool. You're still exploring what it can and can't do. But that also implies you're not yet an expert.
2) The second takeaway, that you're not yet as comfortable with the tools as you think you are, is clearly context management. 2-3kLOC of code is way too much: it's a massive amount of output to hope for good results from (this also ties in with the quality of the prompt, the guidelines and code practices provided, etc.).
3) Neither 1 nor 2 is a criticism of your conclusions or opinions; if anything, they are confirmations of your point that LLMs are not there yet. But what I disagree with is the rush to conclude from your experience that AI coding provides zero net benefit. That I don't share. Instead of settling on what it could do (help with planning, writing a spec file, writing unit tests, providing the more boilerplate-y parts of the code) and using the LLM to reduce friction (and thus provide a net benefit), you essentially asked it to replace you and found out the obvious: LLMs cannot take care of non-trivial business logic yet, and even when they can, the results are nowhere near satisfactory. But that doesn't mean AI-assisted coding is useless or that its net benefit is zero or negative; it only becomes so when the expectations on the tool are too big and the amount of information provided is either too small to get consistent results or so large that the context becomes an issue.
Rochus
3 days ago
I don't know where your confidence or assumptions come from. Do you work for Anthropic? My prompts for the code generators included a 1.2kLOC code file plus detailed instructions (as described elsewhere), with more details during the session. So I don't think your points apply.
throwaway346434
3 days ago
This is kind of a nuts take: a senior engineer uses the tools for a non-trivial undertaking and didn't find value in them.
Your conclusion from that is "but they are doing it wrong", while also claiming they said things they didn't say (zero net benefit, useless, etc.).
Do you see how that might undermine your point? You feel they haven't taken the time to understand the tools, but you didn't actually read what they wrote?
mihaaly
3 days ago
How do you know that your humble opinion about who knows which tool, and how deeply, is right?
Even if you know better than they do how much they know, isn't the tool just inadequate for power use yet, when it is sooo easy to misuse?
Too much tweaking and adapting of users to the needs of the tool (vs. the other way around), and there is little point in using it (which is a bit of the sickness of modern-day computing: "with computers you can solve problems lightning fast that you wouldn't have without them").
handoflixue
3 days ago
Would you agree with the claim that emacs/vim is an inadequate tool, since it has such a high learning curve?
Prior to LLMs, my impression was that "high learning curve, high results" was a pretty popular sweet spot with a large portion of the tech crowd. It seems weird how much LLMs appear to be an exception to this.
gammarator
3 days ago
Emacs and vim have complex interfaces that have been stable for decades. Seems like every new flavor of LLM requires learning its warts and blind spots from scratch.
cztomsik
2 days ago
The situation has improved a little over the last few months, but LLMs are still only barely usable in languages like C/C++/Zig, and it's not about prompting. I would say that LLMs are usable for JS/Python, and while the code is not always what I'd write myself, it can be used and improved later (unless you are working on a perf-sensitive JS app, in which case it's useless again).
And it might also have something to do with GC, because I suppose the big boys are doing some GRPO over synthetically generated/altered source code (I would!), but obviously doing that in C++ is much more challenging, and I'd expect Rust to be straight-up impossible.
oliwary
3 days ago
Hey! I would encourage you to try out Claude Code instead, which is also part of your subscription. It's a CLI that takes care of many of the issues you encountered, as it works directly on the code files in a directory. No more copy-pasting or unscrambling results. Likewise, it can run commands itself to e.g. compile or even test code.
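Getting started is minimal; from memory it's something like this (double-check the install command against the current docs):

    npm install -g @anthropic-ai/claude-code   # install the CLI
    cd your-project                            # run it from the repo root
    claude                                     # starts an interactive session over these files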
Rochus
3 days ago
I'm working on old hardware and not-recent Linux and compiler versions, and I have no confidence yet in allowing AI direct (write) access to my repositories.
Instead, I provided Claude with the source code of a transpiler to C (one file) which is known to work and uses the same IR the new code generators were supposed to use.
This is a controlled experiment with clear, complete input and clear expectations and specifications for the output. I don't think I would be able to cleanly isolate Claude's contributions and assess its performance if it had access to arbitrary parts of the source code.
stavros
3 days ago
I use Claude Code with the Max plan, and the experience isn't far off from what you describe. You still need to understand the system and review the implementation, because it makes many mistakes.
That's not the part where it saves me time; it saves me time in looking up documentation. Other than that, it might be slower, because the larger the code change is, the more time I need to spend reviewing, and past a point I just can't be bothered.
The best way I've found is to have it write small functions, and then I tell it to compose them together. That way, I know exactly what's happening in the code, and I can trust that it works correctly. Cursor is probably a better way to do that than Claude Code, though.
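To give the flavor of what I mean, here's a made-up sketch (all names hypothetical): each helper below would be its own small prompt, and the composition at the end is the part I write myself, so I know exactly what's happening:

    import json

    def load_records(path):
        # one prompt: read a JSON-lines file into a list of dicts
        with open(path) as f:
            return [json.loads(line) for line in f if line.strip()]

    def filter_active(records):
        # another prompt: keep only records marked active
        return [r for r in records if r.get("active")]

    def summarize(records):
        # another prompt: count records per category
        counts = {}
        for r in records:
            key = r.get("category", "unknown")
            counts[key] = counts.get(key, 0) + 1
        return counts

    def report(path):
        # the composition stays mine
        return summarize(filter_active(load_records(path)))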
t_mahmood
3 days ago
So I am paying $20 for a glorified code generator, which may or may not be correct, to write a small function that I could write for free and be confident about its correctness, provided I'm not too lazy to write a test for it.
You might point out that with tests it's the same with any AI tool available, but to get to that result I have to keep prompting it until it gives me the desired output, whereas on my own I could get there in 2-3 iterations.
Reading documentation always leaves me a little more knowledgeable than before, while prompting an LLM gives me no knowledge at all.
And I also have to decide which LLM would be good for the task at hand, and most of them are not free (unless I use a local one, but that uses the GPU and adds an energy cost).
I may be nitpicking, but I see too many holes in this approach.
stavros
3 days ago
The biggest hole you don't see is that it's worth the $20 to make me overcome my laziness, because I don't like writing code, but I like making stuff, and this way I can make stuff while fooling my brain into thinking I'm not writing code.
t_mahmood
3 days ago
Sure, that can be a point: it helps you overcome a personal barrier. But that could be anything.
That's not what you were vouching for in the original comment. It was about saving time.
weard_beard
3 days ago
Not only that, but the process described is how you train a junior dev.
There, at least, the wasted time results in the training of a human being, who can become sophisticated enough to be a trusted independent implementer within a relatively short time.
turtlebits
3 days ago
Your time isn't free, and I'd certainly value it at more than $20/month.
I find it extremely useful as a smarter autocomplete, especially for the tedious work: changing function definitions, updating queries when the DB schema changes, and writing HTTP requests/API calls from vendor/library documentation.
t_mahmood
3 days ago
Certainly, which is why I use an IDE, IntelliJ Ultimate to be precise.
None of the use cases you mention requires an LLM; they are all available as IDE functionality.
IntelliJ has LLM-based autocomplete, which I am okay with, but it is still wrong too many times. It works extremely well with Rust, though. Their non-LLM autocomplete is also superb; it uses ML to suggest the closest relevant match, IIRC.
It also makes refactoring a breeze; I know exactly what it's going to do.
It can also handle database refactoring to a certain extent! And for that it does not require an LLM, so there's no nondeterministic behavior.
Also, the IDE has its own way of doing HTTP requests, and it's really nice! And I can use their live templates to autocomplete any boilerplate code. That only needs setting up once; no need to fiddle with prompts.
mattacular
3 days ago
> The best way I've found is to have it write small functions, and then I tell it to compose them together.
Pretty much how I code without AI, except that it's my brain breaking the problem down into small functions, and I express them in code rather than in a chat.
Rochus
3 days ago
> it saves me time in looking up the documentation
I have a Perplexity subscription which I use heavily for that purpose, just asking how something works or how it should be used, and getting a response right on point and with examples. Very useful indeed. Perplexity gives me access to Claude Sonnet 4 w/ and w/o Thinking, which I consider great models, and it can also generate decent code. My intention was to find out how good the recent Claude Opus is in comparison and how much of my work I'm able to delegate. Personally, I much prefer the user interface features, performance, and availability of Perplexity to Claude.ai.
gommm
3 days ago
I end up using Perplexity a lot too, especially when I'm doing something unfamiliar. It's also a good way to quickly find out the best practices for a given framework/language I'm not that familiar with (I usually ask it to link to examples in the wild, and it finds open-source projects illustrating those points).
stavros
3 days ago
I have both, and Perplexity is much more like a search engine than a chat companion (or at least that's how I use it). I like both, though.
Rochus
3 days ago
You can select the model. I very much appreciate the Claude Sonnet models, which are very good and rational discussion partners, responding to arguments in detail and critically and allowing for the dialectical exploration of complex topics. I have also experimented with other models, including ChatGPT, Gemini, and Grok, but the resulting discussions were only a fraction as useful (i.e. more optimized towards affirmative, feel-good small talk, from my humble point of view).
stavros
3 days ago
Hmm, I've never tried that, even though I prefer Claude in general too. I'll try that, thanks!
fluidcruft
3 days ago
claude-code asks you to allow anything before it does it. Once you start trusting it and get comfortable with its behavior, being prompted all the time gets annoying, so you can whitelist specific commands it wants to run. You can also interactively toggle into (and out of) "accept changes without asking" mode.
(It wasn't clear to me that I would be able to toggle out of accept-changes mode, so I resisted for a loooooong time. But it turns out it's just a toggle and can be changed in real time as it's chugging along. There's also a planning state, but I haven't looked into that yet.)
It always asks before running commands unless you whitelist them. I have whitelisted running test suites and linters, for example, so it can iterate in those corners with minimal interaction. I have had to learn to let it go ahead and make small, obvious mistakes rather than intervening immediately, because the linters and tests will catch them, and Claude will diagnose the failure and fix it at that point.
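From memory, the whitelist ends up as permission rules in the project's .claude/settings.json, roughly like this (check the docs; the exact matcher syntax may have changed since I set mine up):

    {
      "permissions": {
        "allow": [
          "Bash(make test:*)",
          "Bash(make lint:*)"
        ]
      }
    }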
Anyway, I took a small toy project and used it to get a feel for claude-code. In my experience, using the /init command to create CLAUDE.md (or asking Claude to interview you to create it) is vital for consistent behavior.
I haven't had good "vibe" experiences yet. Mostly I know what I want to do and just basically delegate the implementation. One thing that has worked well for me is to ask Claude to propose a few ways to improve or implement a feature. It has come up with a few things I hadn't thought of that way.
Anyway, claude-code was very good at slowly and incrementally earning my trust. I resisted trying it because I expected it to just run hog-wild doing bewildering things, but that's not what it does. It tends to be a bit of an asskisser in its communication style, in a way that would annoy me if it were a real person. But I've managed to look past that.
kace91
3 days ago
On Claude you specifically accept any attempt to run a terminal command (optionally whitelisting it), so there's no risk that it will force-push something or whatever. You can also whitelist with granularity, for example enabling it to use git to view logs but not to commit.
You can just let it work, see what’s uncommitted after it’s over, and get rid of the result if you don’t like it.
kelnos
3 days ago
> I have no confidence yet in allowing AI direct (write) access to my repositories.
You don't need to give it write access to your repositories, just to a source tree.
boesboes
3 days ago
I've been trying it for a couple of months, and I can't recommend it either, tbh. It's frustrating as hell to work with: super inconsistent, very bad at following its own instructions, wasteful, and generally unreliable.
The problem is, it's like a very, very junior programmer who knows the framework well but won't use it consistently and doesn't learn from mistakes AT ALL. And has amnesia. Fine for some trivial things, but for anything more complicated the hand-holding becomes so involved that you are better off doing it yourself. That way you internalize some of the solutions as well, which is nice because then you can debug things later! Now I have a huge PR that even I don't really grasp as much as I would want to.
But for me the nail in the coffin was the terrible customer service. ymmv.
jennyholzer
3 days ago
[flagged]
Rochus
3 days ago
Do you mean Claude Code should fail? Why?
jennyholzer
3 days ago
[flagged]
Rochus
3 days ago
In what specific programming language/toolchain/technology is your experience? Why do you think that "everybody can tell that chat gpt wrote your code"? Meanwhile, I have looked at a lot of LLM-generated code in different languages, and I wouldn't generally subscribe to your statement. And you still haven't explained why Claude should fail. I think it is rather an advantage (once it works reliably in the future).
blks
3 days ago
I can tell when my coworkers' Go code is generated by an LLM. I hate it very much.
actionfromafar
3 days ago
Wow, shots fired! Would you add something to that?
jennyholzer
3 days ago
I spend a lot of time fixing unacceptably poor code that LLM platforms have tricked human coworkers into finding adequate.
My coworkers are increasingly ignorant about the software products they work on.
LLM-informed software development is organizationally poisonous.
Businesses selling LLM coding tools occupy the same place in my mind as drug dealers.
kelnos
3 days ago
Feels like that's more of a problem with the competence of your coworkers than with the LLM. The LLM is just exposing how bad they are.
furyofantares
3 days ago
I'm just shy of 30 years experience. I think I've spent more time learning how to use these tools than any other technology I've learned, and I still don't know the best way to use them.
They certainly weren't a time-saver right away, but they became one after some time spent giving them a real shot. I tested their limits and mine on small projects: working out how to get them to do a whole project, figuring out when they stop working and why, figuring out which technologies they work best with, figuring out the right size of problem to give them, and figuring out how to recognize when I'm asking them something they can't do well and ask something different instead, or when I'm guiding them into creating code that they can't actually continue to be successful with.
I started last December in Cursor's agentic mode and have been in Claude Code since probably March or April. It's definitely been a huge boost all year for side projects, but only in the last couple of months have I been having success in a large codebase.
Even with all this experience, I don't know that I would really be able to get much value out of the chat interface. They need to be proposing changes I can just hit accept or reject on (this is how both Claude Code and Cursor work, btw: you don't have to allow them to write to any file you don't want, or execute any command you don't want).
aiiizzz
3 days ago
The fact that you use Claude, not GPT-5, which is light-years ahead for coding, tells me all I need to know.
elashri
3 days ago
I don't think it is true that GPT-5 is much better than Claude 4.1. More importantly, a light year is a measure of distance, and I am sure OpenAI's data centers are still on Earth.
kelnos
3 days ago
I hate to play the "you're holding it wrong" card, but when I started, I had more or less the same experience. Eventually you start to learn how to better talk to it in order to get better results.
Something I've found useful with Claude Code is that it works a lot better if I give it many small tasks to perform to eventually get the big thing done, rather than just dumping the big thing in its lap. You can do this interactively (prompt, output, prompt, output, prompt, output...) or by writing a big markdown file with the steps to build it laid out.
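A hypothetical plan file can be as simple as this (all names made up):

    ## Goal: add CSV export to the report page
    1. Add a to_csv() method on ReportModel; unit-test it in isolation.
    2. Add a /reports/<id>.csv route that calls it.
    3. Add a download link to the report template.
    Do one step at a time; stop for review after each step.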
JeremyNT
3 days ago
While this matches my experience, it's worth mentioning that the act of breaking a task up into correctly sized chunks and describing it in English is itself a non-trivial task, which can be more time-consuming than simply writing the actual code.
The fact that it works is amazing, but I'm less convinced that it's enhancing my productivity.
(I think the real productivity boost for me is when I still write the code and have the assistant write test coverage based on diffs, which is trivial to prompt for with good results.)
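The whole prompt can be a one-liner, something like this (a sketch, assuming Claude Code's -p print flag):

    git diff main | claude -p "Write unit tests covering the behavior changed in this diff"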
kristianbrigman
3 days ago
And it's one that a lot of people skip, so that forcing function might make for better code, even if it isn't faster.
chillingeffect
3 days ago
Similar here. AI works much better as a consultant than as a developer. I ask it all kinds of things I have suspicions and intuitions about, and it provides clarity and examples. It's great for subroutines. Trying to make full programs is just too large a space; it's difficult to communicate all the implicit requirements.
jennyholzer
3 days ago
People who consistently consult LLMs for product direction or software feature design overwhelmingly appear to me as willfully ignorant dullards.
I mean it's even further than willful ignorance. It's delight in one's own ignorance.
JeanMarcS
3 days ago
This. For me (senior as in I've been in the field since the last century), that's how I use it: "I want to do this, with this data, to obtain that."
I still do the part of my job that I have experience in, analyzing the need, and use the AI like an assistant to do small libraries or parts of the code. That way, errors have less chance to appear. Then I glue it all together.
For me, that's the best use of the time ratio. If I have to describe the whole thing, I'm not far from doing it myself, so there's no point for me.
Important: I work alone, not in a team, so maybe that has an impact on my thinking.
Rochus
3 days ago
I just tried to use it for something where I expected it to provide the most benefit (in my case): being able to fully delegate a complicated (and boring) part to a machine would give me more time for the things I'm really interested in. I think we are on the right track in this regard, but we still have a long way to go.
deadbabe
3 days ago
It’d be nice if we could “pipe” prompts directly similar to how we pipe multiple Unix commands to eventually get what we really want.
Then we can give someone that entire string of prompts as a repeatable recipe.
kasey_junk
3 days ago
You can send prompts to Claude on the command line; I typically save prompts in the repo. But note it won't produce deterministic output.
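For example (assuming the -p/--print flag; the exact flags may differ by version):

    # one-shot, non-interactive: prints the result and exits
    claude -p "Summarize the TODOs under src/ and propose an order to tackle them"

    # and prompts do compose with ordinary Unix pipes
    claude -p "List the public functions in src/parser.c" | claude -p "Draft doc comments for these"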