AdieuToLogic
12 hours ago
Perhaps the most telling portion of their decision is:
Quality concerns. Popular LLMs are really great at
generating plausibly looking, but meaningless content. They
are capable of providing good assistance if you are careful
enough, but we can't really rely on that. At this point,
they pose both the risk of lowering the quality of Gentoo
projects, and of requiring an unfair human effort from
developers and users to review contributions and detect the
mistakes resulting from the use of AI.
The first non-title sentence is the most notable to consider, with the rest providing reasoning difficult to refute.
JohnBooty
9 minutes ago
There are a number of other issues such as the ethical and environmental ones. However, this one in isolation...
Popular LLMs are really great at
generating plausibly looking, but meaningless
content. They are capable of providing good
assistance if you are careful enough
I'm struggling to understand this particular angle.
Humans are capable of generating extremely poor code. Improperly supervised LLMs are capable of generating extremely poor code.
How is this an LLM-specific problem?
I believe part of (or perhaps the entire) argument here is that LLMs certainly enable more unqualified contributors to generate larger quantities of low-quality code than they would have been able to otherwise. Which... is true.
But still I'm not sure that LLMs are the problem here? Nobody should be submitting unexpected, large, hard-to-review quantities of code in the first place, LLM-aided or otherwise. It seems to me that LLMs are, at worst, exposing an existing flaw in the governance process of certain projects?
jjmarr
10 hours ago
I've been using AI to contribute to LLVM, which has a liberal policy.
The code is of terrible quality and I am at 100+ comments on my latest PR.
That being said, my latest PR is my second-ever to LLVM and is an entire linter check. I am learning far more about compilers at a much faster pace than if I took the "normal route" of tiny bugfixes.
I also try to do review passes on my own code before asking for code review to show I care about quality.
LLMs increase review burden a ton but I would say it can be a fair tradeoff, because I'm learning quicker and can contribute at a level I otherwise couldn't. I feel like I will become a net-positive to the project much earlier than I otherwise would have.
edit: the PR in question. Unfortunately I've been on vacation and haven't touched it recently.
https://github.com/llvm/llvm-project/pull/146970
It's a community's decision whether to accept this tradeoff & I won't submit AI generated code if your project refuses it. I also believe that we can mitigate this tradeoff with strong social norms that a developer is responsible for understanding and explaining their AI-generated code.
totallymike
10 hours ago
How deliciously entitled of you to decide that making other people try to catch ten tons of bullshit because you’re “learning quicker and can contribute at a level you otherwise couldn’t” is a tradeoff you’re happy to accept
If unrepentant garbage that you make others mop up at risk of their own projects’ integrity is the level you aspire to, please stop coding forever.
jjmarr
9 hours ago
I didn't make a decision on the tradeoff, the LLVM community did. I also disclosed it in the PR. I also try to mitigate the code review burden by doing as much review as possible on my end & flagging what I don't understand.
If your project has a policy against AI usage I won't submit AI-generated code because I respect your decision.
h4ny
9 hours ago
> I didn't make a decision on the tradeoff, the LLVM community did. I also disclosed it in the PR.
That's not what the GP meant. Just because a community doesn't disallow something doesn't mean it's the right thing to do.
> I also try to mitigate the code review burden by doing as much review as possible on my end
That's great but...
> & flagging what I don't understand.
It's absurd to me that people should commit code they don't understand. That is the problem. Just because you are allowed to commit AI-generated/assisted code does not mean that you should commit code that you don't understand.
The overhead to others of committing code that you don't understand and then asking someone to review it is a lot higher than asking someone for directions first so you can understand the problem and the code you write.
> If your project has a policy against AI usage I won't submit AI-generated code because I respect your decision.
That's just not the point.
overfeed
6 hours ago
> It's absurd to me that people should commit code they don't understand
The industrywide tsunami of tech debt arising from AI detritus[1] will be interesting to watch. Tech leadership is currently drunk on improved productivity metrics (via lines of code or number of PRs), but I bet velocity will slow down and products will be more brittle due to extraneous AI-generated code, with a lag, so it won't be immediately apparent. Only teams with rigorous reviews will fare well in the long term, but may be punished in the short term for "not being as productive" as others.
1. From personal observation: when I'm in a hurry, I accept code that does more than is necessary to meet the requirements, or is merely not succinct. Whereas pre-AI, less code would be merged with a "TBD" tacked on.
Phelinofist
6 hours ago
Where did you disclose it?
Sayrus
6 hours ago
Only after getting reviews so it is hidden by default: https://github.com/llvm/llvm-project/pull/146970#issuecommen...
optionalsquid
2 hours ago
Disclosing that you used AI three days after making the PR, after 4 people had already commented on your code, doesn't sit right with me. That's the kind of thing that should be disclosed in the original PR message. Especially so if you are not confident in the generated code
0000000000100
8 hours ago
Go look at the PR man, it's pretty clear that he hasn't just dumped out LLM garbage and has put serious effort and understanding into the problem he's trying to solve.
It seems a little mean to tell him to stop coding forever when his intentions and efforts seem pretty positive for the health of the project.
thesz
7 hours ago
One of the resolved conversations contains a comment "you should warn about incorrect configuration in constructor, look how it is done in some-other-part-of-code."
This means that he did not put serious effort into understanding what others do, when, and why in a highly structured project like LLVM. He "wrote" the code and then dumped the "written" code onto the community to catch mistakes.
StopDisinfo910
39 minutes ago
Have you ever contributed to a very large project like LLVM? I would say clearly not from the comment.
There are pitfalls everywhere. It’s not so small that you can get everything in your head with only a reading. You need to actually engage with the code via contributions to understand it. 100+ comments is not an exceptional amount for early contributions.
Anyway, LLVM is so complex I doubt you can actually vibe-code anything valuable, so there is probably a lot of actual work in the contribution.
There is a reason the community didn’t send them packing. Onboarding newcomers is hard but it pays off.
onli
6 hours ago
That is normal for a new contributor. You can't reasonably expect knowledge of all the conventions of the project. There has to be effort to produce something good and not overload the maintainers, I agree, but missing such a detail is not a sign that this is not happening here.
anal_reactor
4 hours ago
Every hobby at some point turns into an exclusive, invitation-only club in order to maintain the quality of each individual's contribution, but then old members start to literally die and they're left wondering why the hobby died too. I feel like most people don't understand that any organization that wants to grow needs to sacrifice quality in order to attract new members.
sethammons
3 hours ago
Your final sentence moved me. Moved to flagging the post, that is.
noosphr
9 hours ago
That's no different to onboarding any new contributor. I cringe at the code I put out when I was 18.
On top of all that every open source project has a gray hair problem.
Telling people excited about a new tech to never contribute makes sure that all projects turn into templeOS when the lead maintainer moves on.
totallymike
9 hours ago
Unrelated to my other point, I absolutely get wanting to lower barriers, but let’s not forget that templeOS was the religious vanity project of someone who could have had a lot to teach us if not for mental health issues that were extant early enough in the roots of the project as to poison the well of knowledge to be found there. And he didn’t just “move on,” he died.
While I legitimately do find templeOS to be a fascinating project, I don’t think there was anything to learn from it at a computer science level other than “oh look, an opinionated 64-bit operating environment that feels like classical computing and had a couple novel ideas”
I respect that instances like it are demonstrably few and far between, but don’t entertain its legacy far beyond that.
lelanthran
4 hours ago
> While I legitimately do find templeOS to be a fascinating project, I don’t think there was anything to learn from it at a computer science level other than “oh look, an opinionated 64-bit operating environment that feels like classical computing and had a couple novel ideas”
I disagree, actually.
I think that his approach has a lot to teach aspiring architects of impossibly large and complex systems, such as "create a suitable language for your use-case if one does not exist. It need not be a whole new language, just a variation of an existing one that smooths out all the rough edges specific to your complex software".
His approach demonstrated very large gains in an unusually complicated product. I can point to projects written in modern languages that come nowhere close to being as high-velocity as his, because his approach was fine-tuned to the use-case of "high-velocity while including only the bare necessities of safety."
totallymike
9 hours ago
Onboarding a new contributor implies you’re investing time into someone you’re confident will pay off over the long run as an asset to the project. Reviewing LLM slop doesn’t grant any of that, you’re just plugging thumbs into cracks in the glass until the slop-generating contributor gets bored and moves on to another project or feels like they got what they wanted, and then moves on to another project.
I accept that some projects allow this, and if they invite it, I guess I can’t say anything other than “good luck,” but to me it feels like long odds that any one contributor who starts out eager to make others wade through enough code to generate that many comments purely as a one-sided learning exercise will continue to remain invested in this project to the point where I feel glad to have invested in this particular pedagogy.
noosphr
7 hours ago
>Onboarding a new contributor implies you’re investing time into someone you’re confident will pay off over the long run as an asset to the project.
No you don't. And if you're that entitled to people's time you will simply get no new contributors.
totallymike
24 minutes ago
I’ll grant you that, but at least a new contributor who actually writes the code they contribute has offered some level of reciprocity with respect to the time it takes to review their contributions.
Trying to understand a problem and taking some time to work out a solution proves that you’re actually trying to learn and be helpful, even if you’re green. Using an LLM to generate a nearly-thousand-line PR and yeeting it at the maintainers with a note that says “I don’t really know what this does” feels less hopeful.
I feel like a better use of an LLM would be to use it for guidance on where to look when trying to see how pieces fit together, or maybe get some understanding of what something is doing, and then by one’s own efforts actually construct the solution. Then, even if one only has a partial implementation, it would feel much more reasonable to open a WIP PR and say “is this on the right track?”
MangoToupe
6 hours ago
I think the project and reviewers are both perfectly capable of making their own decisions about the best use of their own time. No need to act like a dick to someone willing to own up to their own behavior.
fuoqi
7 hours ago
Well, some people just operate under the "some of you may die, but it's a sacrifice I am willing to make" principle...
bestham
9 hours ago
IMO that is not your call to make, it is the reviewers' call to make. It is the reviewers' resources you are spending to learn more quickly. You are consuming a “free” resource for personal gain because you feel that it is justified in your particular case. It would likely not scale, and would grind many projects to a halt, at least temporarily, if this was done at scale.
ororroro
9 hours ago
The decision is made by llvm https://llvm.org/docs/FAQ.html#id4
BrenBarn
7 hours ago
I would interpret this as similar to being able to take paper napkins or straws at a restaurant. You may be welcome to take napkins, but if you go around taking all the napkins from every dispenser you'll likely be kicked out and possibly they'll start keeping the napkins behind the counter in the future. Similarly if people start treating "you can contribute AI code to LLVM" as "feel free to submit nonsense you don't understand", I would not be surprised to see LLVM change its stance on the matter.
JonChesterfield
19 minutes ago
This is exciting. Thank you for raising the point. I've posted https://discourse.llvm.org/t/our-ai-policy-vs-code-of-conduc... to see what other people think of this. Thank you for your commit, and especially for not mentioning that it's AI generated code that you don't understand in the review, as it makes my point rather more forcefully than otherwise.
AdieuToLogic
8 hours ago
> I've been using AI to contribute to LLVM, which has a liberal policy.
This is a different decision made by the LLVM project than the one made by Gentoo, which is neither right nor wrong IMHO.
> The code is of terrible quality and I am at 100+ comments on my latest PR.
This may be part of the justification of the published Gentoo policy. I am not a maintainer of same so cannot say for certain. I can say it is implied within their policy:
At this point, they pose both the risk of lowering the
quality of Gentoo projects, and of requiring an unfair
human effort from developers and users to review
contributions ...
> LLMs increase review burden a ton ...
Hence the Gentoo policy.
> ... but I would say it can be a fair tradeoff, because I'm learning quicker and can contribute at a level I otherwise couldn't.
I get it. I really do.
I would also ask - of the requested changes reviewers have made, what percentage are due to LLM generated changes? If more than zero, does this corroborate the Gentoo policy position of:
Popular LLMs are really great at generating plausibly
looking, but meaningless content.
If "erroneous" or "invalid" were the adjective used instead of "meaningless"?
benreesman
3 hours ago
I'm a bit later in my career and I've been involved with modern machine learning for a long time which probably affects my views on this, but I can definitely relate to aspects of it.
I think there are a couple of good signals in what you've said but also some stuff (at least by implication/phrasing) that I would be mindful of.
The reason why I think your head is fundamentally in a good place is that you seem to be shooting for an outcome where already high effort stays high, and with the assistance of the tools your ambition can increase. That's very much my aspiration with it, and I think that's been the play for motivated hackers forever: become as capable as possible as quickly as possible by using every effort and resource. Certainly in my lifetime I've seen things like widely distributed source code in the 90s, Google a little later, StackOverflow indexed by Google, the mega-grep when I did the FAANG thing, and now the language models. They're all related (and I think less impressive/concerning to people who remember pre-SEO Google, that was up there with any LLM on "magic box with reasonable code").
But we all have to self-police on this because with any source of code we don't understand, the abstraction almost always leaks, and it's a slippery slope: you get a little tired or busy or lazy, it slips a bit, next thing you know the diff or project or system is jeopardized, and you're throwing long shots that compound.
I'm sure the reviewers can make their own call about whether you're in an ok place in terms of whether you're making a sincere effort or if you've slipped into the low-integrity zone (LLVM people are serious people), just be mindful that if you want the most out of it and to be welcome on projects and teams generally, you have to keep the gap between ability and scope in a band: pushing hard enough to need the tools and reviewers generous with their time is good, it's how you improve, but go too far and everyone loses because you stop learning and they could have prompted the bot themselves.
thrownawayohman
9 hours ago
Ahhahaha what the fuck. This is what software development has become? Using an LLM to generate code that not only do you not understand, but most likely isn’t even correct, and then shoehorn the responsibility of ensuring it doesn’t break anything onto the reviewer? lol wow
jlebar
9 hours ago
As a former LLVM developer and reviewer, I want to say:
1. Good for you.
2. Ignore the haters in the comments.
> my latest PR is my second-ever to LLVM and is an entire linter check.
That is so awesome.
> The code is of terrible quality and I am at 100+ comments on my latest PR.
The LLVM reviewers are big kids. They know how to ignore a PR if they don't want to review it. Don't feel bad about wasting people's time. They'll let you know.
You might be surprised how many PRs even pre-LLMs had 100+ comments. There's a lot to learn. You clearly want to learn, so you'll get there and will soon be offering a net-positive contribution to this community (or the next one you join), if you aren't already.
Best of luck on your journey.
jjmarr
9 hours ago
Thanks. I graduated 3 months ago and this has been a huge help.
thesz
7 hours ago
> You might be surprised how many PRs even pre-LLMs had 100+ comments
What about percentages?
close04
6 hours ago
> They know how to ignore a PR if they don't want to review it
How well does that scale as the number of such contributions increases and the triage process itself becomes a sizable effort?
LLMs can inadvertently create a sort of DDoS even with the best intentions, and mitigating it costs something.
sampullman
5 hours ago
Wait and see, then change the policy based on what actually happens.
I sort of doubt that all of a sudden there's going to be tons of people wanting to make complex AI contributions to LLVM, but if there are, just ban them at that point.
yeasku
4 hours ago
It has happened to Curl.
29athrowaway
9 hours ago
LLMs trained on open source make the common mistakes that humans make.
wobfan
2 hours ago
> make.
No, made. Which is a very important difference.
paulcole
10 hours ago
How is it telling at all?
It’s just what every other tech bro on here wants to believe, that using LLM code is somehow less pure than using free-range-organic human written code.
Kwpolska
7 hours ago
Tech bros want the exact opposite, so that they can sell their AI crap and replace human developers with AI bots.