efitz
a day ago
There is a general problem with rewarding people for the volume of stuff they create, rather than the quality.
If you incentivize researchers to publish papers, individuals will find ways to game the system: meeting the minimum quality bar with the least effort to create the most papers and thereby receive the greatest reward.
Similarly, if you reward content creators based on views, you will get view maximization behaviors. If you reward ad placement based on impressions, you will see gaming for impressions.
Bad metrics or bad rewards cause bad behavior.
We see this over and over because the reward issuers are designing systems to optimize for their upstream metrics.
Put differently, the online world is optimized for algorithms, not humans.
noobermin
a day ago
Sure, just as long as we don't blame LLMs.
Blame people, bad actors, systems of incentives, the gods, the devils, but never broach the fault of LLMs and their widespread abuse.
miki123211
a day ago
LLMs are tools that make it easier to hack incentives, but you still need a person to decide that they'll use an LLM to do so.
Blaming LLMs is unproductive. They are not going anywhere (especially since open source LLMs are so good).
If we want to achieve real change, we need to accept that they exist, understand how that changes the scientific landscape, and work out our options from there.
noobermin
20 hours ago
everyone keeps claiming "they're here to stay" as if it's gospel. this constant drumbeat is rather tiresome and without much hard evidence.
LunaSea
15 hours ago
Genuinely curious, did we ever manage to ban a piece of technology worldwide and effectively?
oscaracso
8 hours ago
A large part of geopolitics is concerned with limiting the spread of weapons of mass destruction worldwide and to the greatest possible degree of efficacy. Moreover, the investment to train state-of-the-art models is greater than the Manhattan project and involves larger and more complex supply chains-- it cannot be done clandestinely. Because the scope of the project is large and resource-intensive there are not many bodies that would have to cooperate in order to place impassable obstacles on the path that is presently being taken. 'What if they won't cooperate toward this goal?' -- Worth considering, but the fact is that they can and are choosing not to. If the choice is there it is not an inevitability but a decision.
andrybak
14 hours ago
Do chlorofluorocarbons (CFCs) mostly banned by the Montreal Protocol count?
tbrownaw
9 hours ago
And lead in gasoline, and probably quite a few other things where we found a way to get similar end results with fewer annoying side effects.
gus_massa
15 hours ago
If they go away, it's because they have been replaced by something better (worse) like LLLM or LLMM or whatever.
I'm old enough to remember when GANs were going to be used to scam millions of people and flood social media with fake profiles.
latentsea
16 hours ago
What evidence do you need exactly?
I think such statements are likely projections of people's own unwillingness to part with such tools given their own personal perceived utility.
I, for one, wouldn't give up LLMs. Too useful to me personally. So, I will always seek them out.
Alex2037
9 hours ago
your spiritual predecessors campaigned against electricity, radio, audio recordings, TV, computers, video games, CGI, the internet, cellphones, smartphones, and perhaps a myriad other things.
https://upload.wikimedia.org/wikipedia/commons/8/85/The_Unre...
https://www.smithsonianmag.com/history/musicians-wage-war-ag...
etc.
but yes, of course, this time it's going to be different, because unlike those boomers, you and your internet friends are on the right side of history.
>without much hard evidence.
1. China. China has the tech, the talent, and the hardware. you could (you can't), for example, equate LLMs to CSAM in the West to make it absolutely verboten, but China wouldn't give a shit, and 93% of the world would use Chinese tech, dismissing your dollar store Butlerian Jihad as yet another bout of America's schizophrenia.
2. it's been less than 3 years since ChatGPT's release, and it now has 800 million active weekly users. and it's not even available in China and Russia, where Deepseek and other Chinese models easily add another 200-300 million users. no other technology has had such explosive proliferation before. good luck convincing all these people who now use it every day to give it up because... because it's bad, mkay?
3. unlike the previous one, the current US administration - which will remain in power for at least three more years - is not hostile to this technology. there will be no regoolations, no moratoriums, and no matter how utterly detached from reality the next administration might end up being, in three years it will be too late to do anything about it (even more so than now).
4. trillion dollar corporations have collectively invested hundreds of billions into this technology. oh, they would love some regulations to hamstring their competitors, but if you try to step on their toes, well, good luck.
5. local models are already good enough to be perpetually useful. what the fuck are you going to do, order door-to-door seizure of fully semi-automatic GPUs?
cyco130
a day ago
LLMs are not people. We can’t blame them.
wvenable
a day ago
What would be the point of blaming LLMs? What would that accomplish? What does it even mean to blame LLMs?
LLMs are not submitting these papers on their own, people are. As far as I'm concerned, whatever blame exists rests on those people and the system that rewards them.
jsrozner
a day ago
Perhaps what is meant is "blame the development of LLMs." We don't "blame guns" for shootings, but certainly with reduced access to guns, shootings would be fewer.
nandomrumber
a day ago
Guns have absolutely nothing to do with access to guns.
Guns are entirely inert objects, devoid of either free will nor volition, they have no rights and no responsibilities.
LLMs likewise.
nsagent
a day ago
To every man is given the key to the gates of heaven. The same key opens the gates of hell.
-Richard Feynman
https://www.goodreads.com/quotes/421467-to-every-man-is-give...
xandrius
a day ago
I blame keyboards, without them there wouldn't be these problems.
anonym29
a day ago
This was a problem before LLMs and it would remain a problem if you could magically make all of them disappear.
LLMs are not the root of the problem here.
RobotToaster
a day ago
See Goodhart's law: "When a measure becomes a target, it ceases to be a good measure"
hammock
a day ago
> There is a general problem with rewarding people for the volume of stuff they create, rather than the quality. If you incentivize researchers to publish papers, individuals will find ways to game the system,
I heard someone say something similar about the “homeless industrial complex” on a podcast recently. I think it was San Francisco that pays NGOs funds for homeless aid based on how many homeless people they serve. So the incentive is to keep as many homeless around as possible, for as long as possible.
djeastm
a day ago
I don't really buy it. Are we to believe they go out of their way to keep people homeless? Does the same logic apply to doctors keeping people sick?
ssivark
a day ago
ICYMI, this drew a lot of attention a few years ago.
https://www.cnbc.com/2018/04/11/goldman-asks-is-curing-patie...
SOLAR_FIELDS
11 hours ago
This could literally be an Onion headline
alfalfasprout
a day ago
It's a metric attribution problem. The real metric should be reduction in homelessness, for example (though even that can be gamed through bussing them out, etc. -- tactics that unfortunately other cities have adopted). But attributing that to a single NGO is tough.
Ditto for views, etc. Really what you care about as, e.g., YouTube is conversions for the products that are advertised. Not impressions. But there's an attribution problem there.
wizzwizz4
12 hours ago
Define the metric as "people helped": then bussing them out to abandon them somewhere else isn't a solution, because the adjudicators can go "yes, you made the number go down, but you did so by decoupling the metric from what it was supposed to measure, so we're not rewarding you for it".
SOLAR_FIELDS
11 hours ago
My spouse works in the homelessness field, and the correct metric to follow is the number of homeless people given housing. It's the "housing first" approach. It's harder to game when you count people directly placed into homes - someone is paying rent and maintaining a trackable occupied space that you can verify the client is actually utilizing - and this approach cannot be gamed by "bus them somewhere else".
What many people don’t realize is just how many normal life hurdles are significantly easier to overcome with a stable housing environment, even if the client is willing and available to work. Employment, for example, has several precursors that you need. Often you need an address. You need an ID. For that you need a birth certificate. To get the birth certificate you need to have the resources and know how to contact the correct agency. All of these things are much harder to achieve without a stable housing environment for the client.
wizzwizz4
7 hours ago
"Number of homeless given housing" is only the correct measure due to the nature of the domain-specific problem. I'm wary of this strategy in general, because the people responsible for deciding how things are accounted for are rarely experts enough to identify sensible domain-specific metrics, so they'll have to consult experts. But that creates a vulnerable point of significant interest to would-be grifters, and if they're not experts enough to assess expert consensus, you end up with metrics that don't work, baked in.
But yes, if we're only looking at homelessness, "how many formerly-homeless people have been given housing?" is a very good way to measure successful interventions.
xhkkffbf
11 hours ago
And then some will wander back closing the loop and preserving jobs.
watwut
17 hours ago
Yeah, it is totally NGO that creates homelessness /s
godelski
a day ago
> rewarding people for the volume ... rather than the quality.
I suspect this is a major part of the appeal of LLMs themselves. They produce lines very fast, so it appears as if work is being done fast. But that's very hard to know, because the number of lines is actually a zero signal of code quality, as is the number of commits. It's a bit insane that we use lines and commits as measures in the first place; they're trivial to hack. You end up rewarding that annoying dude who keeps changing the file so the diff is the entire file and not the 3 lines they edited.
I've been thinking we're living in "Goodhart's Hell", where metric hacking has become the intent: we've decided metrics are all that matter and are perfectly aligned with our goals.
But hey, who am I to critique? I'm just a math nerd. I don't run a multi trillion dollar business that lays off tons of workers because the current ones are so productive due to AI that they created one of the largest outages in the history of their platform (and you don't even know which of the two I'm referencing!). Maybe when I run a multi trillion dollar business I'll have the right to an opinion about data.
slashdave
a day ago
I think you will discover that few organizations use the size or number of edits as a metric of effort. Instead, you might be judged by some measure of productivity (such as resolving issues). Fortunately, language agents are actually useful at coding, when applied judiciously.
godelski
6 hours ago
Yet it's common enough that we see it. Your point also brings up the 10x engineer joke: there are two types of 10x engineers, those who do 10x the work and those who solve 10x the Jira tickets but are the cause of 100x of them.
The point is that people metric hack and very bureaucratic structures tend to incentivize metric hacking, not dissuade them. See Pournelle's Iron Law of Bureaucracy.
> Fortunately, language agents are actually useful at coding, when applied judiciously.
I'm not sure this is in doubt by anyone. By definition it really must be true. The problem is that they're not being used judiciously but haphazardly. The problem is that people in large organizations are more concerned with politics than the product they make.
If you cannot see how quality is decreasing then I'm not sure what to tell you. Yes, there are metrics where it's getting better, but at the same time user frustration is increasing. AWS and Azure just had recent major outages. CrowdStrike took down a large chunk of the world's computers over an avoidable mistake. Microsoft is fumbling the Windows upgrade. Apple Intelligence was a disaster. YouTube search is beyond infuriating. Google search is so bad we turn to LLMs now. These are major and obvious issues. We don't even have time to talk about the million minor issues, like YouTube captions covering captions embedded in the video, which is not a majorly complicated problem to solve with AI; instead they're pushing AI upscaling, which is getting a lot of backlash.
So you can claim things are being used judiciously all you want, but I'm not convinced when looking at the results. I'm not happy that every device I use is buggy as shit and simultaneously getting harder to fix myself.
canjobear
7 hours ago
Who is getting rewarded for uploading tons of stuff to the arXiv?
pwlm
a day ago
What would a system that rewards people for quality rather than volume look like?
What would an online world that is optimized for humans, not algorithms, look like?
Should content creators get paid?
pjdesno
11 hours ago
> What would a system that rewards people for quality rather than volume look like?
Hiring and tenure review based on a candidate’s selected 5 best papers.
Already standard practice at a few enlightened places, I think. (of course this also probably increases the review workload for top venues)
To a lesser extent, bean-counting metrics like citations and h-index are an attempt to quantify non-volume-based metrics. (for non-academics, h-index is the largest N such that your N-th most cited paper has >= N citations)
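A quick illustrative sketch of that definition in Python (made-up citation counts, purely for demonstration):

    def h_index(citations):
        # Largest N such that the N-th most cited paper has >= N citations.
        cites = sorted(citations, reverse=True)
        h = 0
        for n, c in enumerate(cites, start=1):
            if c >= n:
                h = n
            else:
                break
        return h

    print(h_index([10, 8, 5, 4, 3]))  # -> 4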
Note that most approaches like this have evolved to counter “salami-slicing”, where you divide your work into “minimum publishable units”. LLMs are a different threat - from my selfish point of view, one of the biggest risks is that it takes less time to write a bogus paper with an LLM than it does for a single reviewer to review it. That threatens to upend the entire peer reviewing process.
drnick1
a day ago
> Should content creators get paid?
I don't think so. Youtube was a better place when it was just amateurs posting random shit.
vladms
a day ago
> Should content creators get paid?
Everybody "creates content" (like me when I take a picture of beautiful sunset).
There is no such thing as "quality". There is quality for me and quality for you. That is part of the problem: we can't just relate to some external, predefined scale. We (the sum of people) are the approximate, chaotic, inefficient scale.
Be my guest to propose a "perfect system", but - just in case there is no such system - we should make sure each of us "rewards" what we find to be of quality (be it people or content creators), and hope it will prevail. It seems to have worked so far.
MangoToupe
20 hours ago
Crazily, I think the easiest way is to remove any and all incentives, awards, finite funding, and allegedly merit-based positions. Allow anyone who wants to research to research. Natural recognition of peers seems to be the only way to my thinking. Of course this relies on a post-scarcity society so short of actually achieving communism we'll likely never see it happen.
js8
18 hours ago
You don't need post-scarcity to do that. I was born in communist Czechoslovakia (my father was an academic). The government allocated jobs for academics and researchers, and they pretty much had tenure. So you could coast by being unproductive, or get by using your connections to party members (the real currency in the CSSR).
After 1989, most academics complained that the system was not merit-based and practical (applied) enough. So we changed it to grants and publication metrics (modeled after the West). For a while, it worked... until the bureaucracy became overbearing and some learned how to game the system again.
I would say, both systems have failure modes of a similar magnitude, although the first one is probably less hoops and less stress on each individual. (During communism, academia - if you could get there, especially technical sciences - was an oasis of freedom.)
epolanski
18 hours ago
The prize in science is being cited/quoted, not publishing.
Sure, publishing on important papers has its weight, but not as much as getting cited.
PeterStuer
18 hours ago
That might be the "prize", but the "bar" is most certainly publish-or-perish as you work your way up the early academic career ladder. Every conference or workshop attendance needs a paper, regardless of whether you had any breakthrough. And early metrics are most often quantity based (at least 4 accepted journal articles), not citation based.
kjkjadksj
a day ago
I think many with this opinion actually misunderstand. Slop will not save your scientific career. Really it is not about papers but about securing grant funding by writing compelling proposals, and delivering on the research outlined in those proposals.
porcoda
a day ago
Ideally that is true. I do see the volume-over-quality phenomenon with some early career folks who are trying to expand their CVs. It varies by subfield though. While grant metrics tend to dominate career progression, paper metrics still exist. Plus, it’s super common in those proposals to want to have a bunch of your own papers to cite to argue that you are an expert in the area. That can also drive excess paper production.