Springtime
13 hours ago
Ars Technica getting caught using LLMs that hallucinated quotes from the author, then publishing them in its coverage of this very story, is quite ironic.
Even on a forum where I saw the original article by this author posted, someone used an LLM to summarize the piece without having read it fully themselves.
How many levels of outsourced thinking are occurring before it becomes a game of telephone?
sho_hn
11 hours ago
Also ironic: When the same professionals advocating "don't look at the code anymore" and "it's just the next level of abstraction" respond with outrage to a journalist giving them an unchecked article.
Read through the comments here and mentally replace "journalist" with "developer" and wonder about the standards and expectations in play.
Food for thought on whether the users who rely on our software might feel similarly.
There are many places to take this line of thinking, e.g. one argument would be "well, we pay journalists precisely because we expect them to check", or "in engineering we have test-suites and can test deterministically", but I'm not sure any of them hold up. The "the market pays for the checking" argument might also apply to developers reviewing AI code at some point, and those test-suites increasingly get vibed and only checked empirically, too.
Super interesting to compare.
armchairhacker
5 hours ago
- There’s a difference. Users don’t see code, only its output. Writing is “the output”.
- A rough equivalent here would be Windows shipping an update that bricks your PC or one of its basic features, which draws plenty of outrage. In both cases, the vendor shipped a critical flaw to production: factual correctness is crucial in journalism, and a quote is one of the worst things to get factually incorrect because it's so unambiguous (inexcusable) and misrepresents the person quoted (personal).
I’m 100% ok with journalists using AI as long as their articles are good, which at minimum requires factual correctness and not being vacuous. Likewise, I’m 100% ok with developers using AI as long as their programs are good, which at minimum requires decent UX and no major bugs.
fennecbutt
3 hours ago
Tbf I'm fine with it only one way around: if a journalist has tonnes of notes and data on a subject and wants help condensing those down into an article, with assistance prioritising which bits of information to present to the reader, then that's totally fine.
If a journalist has little information and uses an LLM to make "something from nothing", that's when I take issue, because like, what's the point?
Same thing as when I see managers dumping giant "Let's go team!!!11" messages splattered with AI emoji diarrhea like sprinkles on brown frosting. I ain't reading that shit; it could've been a one-liner.
armchairhacker
an hour ago
Another good use of an LLM is to find primary sources.
Even an (unreliable) LLM overview can be useful, as long as you check all facts with real sources, because it can give the framing necessary to understand the subject. For example, asking an LLM to explain some terminology that a source is using.
adamddev1
6 hours ago
Excellent observation. I get so frustrated every time I hear the "we have test-suites and can test deterministically" argument. Have we learned absolutely nothing from the last 40 years of computer science? Testing does not prove the absence of bugs.
Terr_
5 hours ago
Don't worry, the LLM also makes the tests. /s
boothby
10 hours ago
I look forward to a day when the internet is so uniformly fraudulent that we can set it aside and return to the physical plane.
rkomorn
10 hours ago
I don't know if I look forward to it, myself, but yeah: I can imagine a future where in person interactions become preferred again because at least you trust the other person is human. Until that also stops being true, I guess.
hxugufjfjf
8 hours ago
There's a fracking cylon on Discovery!
anonymous908213
7 hours ago
> When the same professionals advocating "don't look at the code anymore" and "it's just the next level of abstraction" respond with outrage to a journalist giving them an unchecked article.
I would expect there is literally zero overlap between the "professionals"[1] who say "don't look at the code" and the ones criticising the "journalists"[2]. The former group tend to be maximalists and would likely cheer on the usage of LLMs to replace the work of the latter group, consequences be damned.
[1] The people that say this are not professional software developers, by the way. I still have not seen a single case of any vibe coder who makes useful software suitable for deployment at scale. If they make money, it is by grifting and acting as an "AI influencer", for instance Yegge shilling his memecoin for hundreds of thousands of dollars before it was rugpulled.
[2] Somebody who prompts an LLM to produce an article and does not even so much as fact-check the quotations it produces can clearly not be described as a journalist, either.
ffsm8
10 hours ago
While I don't subscribe to the idea that you shouldn't look at the code, it's a lot more plausible for devs because you actually do have ways to validate the code without looking at it.
E.g. you technically don't need to look at the code if it's frontend code and part of the product is an e2e test that produces a video of the correct/full behavior via Playwright or similar (a rough sketch follows below).
Same with backend implementations that have instrumentation exposing enough tracing information to determine whether the expected modules were encountered, etc.
I wouldn't want to work with coworkers who actually think that's a good idea, though.
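For concreteness, here's a minimal sketch of that Playwright setup in Python, assuming a hypothetical app at localhost:3000 (the URL and selectors are illustrative placeholders, not a real product):

    # Requires: pip install playwright && playwright install chromium
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        # record_video_dir makes Playwright save a .webm of each page,
        # so a reviewer can watch the behavior instead of reading the diff
        context = browser.new_context(record_video_dir="videos/")
        page = context.new_page()
        page.goto("http://localhost:3000")        # hypothetical app under test
        page.fill("#email", "test@example.com")   # hypothetical selectors
        page.click("button[type=submit]")
        assert page.locator(".dashboard").is_visible()
        context.close()   # the video file is finalized when the context closes
        browser.close()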
Pay08
7 hours ago
If you tried this shit in a real engineering discipline, you'd end up either homeless or in prison in very short order.
ffsm8
3 hours ago
You might notice that these real engineering jobs also don't have a way to verify the product via tests like that, though, which was my point.
And that's ignoring that your statement technically isn't even true, because very few engineers actually work in such fields (i.e. designing bridges, airplanes, etc.).
The majority of them design products where safety isn't nearly as high-stakes as that... and they frequently overspec (wasting money) or underspec (increasing wastage) to boot.
This point has been severely overstated on HN, honestly.
Sorry, but had to get that off my chest.
skydhash
2 hours ago
> You might notice that these real engineering jobs also don't have a way to verify the product via tests like that, though, which was my point.
Are you sure? Simulators and prototypes abound. By the time you’re building the real thing, it’s more like a rehearsal, solving a few problems instead of every intricacy in the formula.
ChrisMarshallNY
3 hours ago
I’ve been saying the same kind of thing (and I have been far from alone) for years about dependaholism.
Nothing new here, in software. What is new is that AI is allowing dependency hell to be experienced by many other vocations.
sphars
11 hours ago
Aurich Lawson (creative director at Ars) posted a comment[0] in response to a thread about what happened: the article has been pulled and they'll follow up next week.
[0]: https://arstechnica.com/civis/threads/journalistic-standards...
_HMCB_
10 hours ago
It’s funny they say the article “may have” run afoul of their journalistic standards. “May have” is carrying a lot of weight there.
usefulposter
7 hours ago
Just like in the original thread that was wiped (https://news.ycombinator.com/item?id=47012384), Ars Subscriptors continue to display a lack of reading comprehension and jump to defending Condé Nast.
All threads have since been locked:
https://arstechnica.com/civis/threads/journalistic-standards...
https://arstechnica.com/civis/threads/is-there-going-to-be-a...
https://arstechnica.com/civis/threads/um-what-happened-to-th...
bombcar
7 hours ago
Ars Technica has fallen substantially from the heady era of Siracusa macOS reviews.
epistasis
12 hours ago
Yikes, I subscribed to them last year on the strength of their reporting, in a time when it's hard to find good information.
Printing hallucinated quotes is a huge shock to their credibility, AI or not. Their credibility was already building up after one of their longtime contributors, a complete troll of a person who was a poison on their forums, went to prison for either pedophilia or soliciting sex from a minor.
Some seriously poor character judgement is going on over there. With all their fantastic reporters, I hope the editors explain this carefully.
singpolyma3
11 hours ago
TBF even journalists who interview people for real and take notes routinely quote them saying things they didn't say. LLMs make it worse, but it's hardly surprising behaviour from them.
pmontra
7 hours ago
I knew firsthand about a couple of news stories in my life. Both were reported quite incorrectly. That was well before LLMs. I assume every news story is quite inaccurate, so I read/hear them to get the general gist of what happened, then I research the details if I care about them.
epistasis
10 hours ago
It's surprising behavior to come from Ars Technica. But also, when journalists misquote, it's usually through a different phrasing of something that people have actually said, sometimes with different emphasis or even meaning. For the people I've known who have been misquoted, it's always been traceable to something they actually did say.
justinclift
7 hours ago
> Their credibility was already building up ...
Don't you mean diminishing or disappearing instead of building up?
Building up sounds like the exact opposite of what I think you're meaning. ;)
zem
6 hours ago
I think they meant it had taken a huge hit and was in the process of building up again
moomin
an hour ago
Ironically, if you actually know what you’re doing with an LLM, getting a separate process to check that the quotations are accurate isn’t even that hard. Not 100% foolproof, because LLM, but way better than the current process of asking ChatGPT to write something for you and then never reading it before publication.
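A minimal sketch of such a check in Python, assuming the draft and the cited source are available as plain strings (function names are illustrative, not part of any real pipeline); the verbatim-match half doesn't even need an LLM:

    import re

    def extract_quotes(article: str) -> list[str]:
        # Pull double-quoted passages longer than a few words
        return [q for q in re.findall(r'"([^"]+)"', article)
                if len(q.split()) > 3]

    def unverified_quotes(article: str, source: str) -> list[str]:
        # Return quotes that do NOT appear verbatim in the source
        haystack = " ".join(source.split()).lower()
        return [q for q in extract_quotes(article)
                if " ".join(q.split()).lower() not in haystack]

    # Any non-empty result should block publication pending human review:
    # problems = unverified_quotes(draft_text, source_text)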
Springtime
an hour ago
The wrinkle in this case is that the author blocked AI bots from their site (it doesn't seem to be a mere robots.txt exclusion, from what I can tell), so if any such bot were trying to do this, it may not have been able to read the page to verify the quotes, and so made them up instead.
This is what the author actually speculated may have occurred with Ars. Clearly something was lacking in the editorial process, though, if such things weren't human-verified either way.
trollbridge
13 hours ago
The amount of effort to click an LLM’s sources is, what, 20 seconds? Was a human in the loop for sourcing that article at all?
phire
12 hours ago
Humans aren't very diligent in the long term. If an LLM does something correctly enough times in a row (or close enough), humans are likely to stop checking its work thoroughly.
This isn't exactly a new problem; we do it with any bit of new software/hardware, not just LLMs. We check its work when it's new, and then tend to trust it over time as it proves itself.
But it seems to be hitting us worse with LLMs, as they are less consistent than previous software. And LLM hallucinations are particularly dangerous because they are often plausible enough to pass the sniff test. We just aren't used to handling something this unpredictable.
Waterluvian
12 hours ago
It’s a core part of the job and there’s simply no excuse for complacency.
jatora
12 hours ago
There's not a human alive who isn't complacent in many ways.
emmelaich
12 hours ago
You're being way too easy on a journalist.
pixl97
12 hours ago
The words on the page are just a medium to sell ads. If shit gets ad views, then producing shit is part of the job... unless you're the one stepping up to cut the checks.
Marsymars
7 hours ago
Ars also sells ad-free subscriptions.
intended
11 hours ago
This is a first-degree expectation of most businesses.
What the OP pointed out is a fact of life.
We do many things to ensure that humans don’t get “routine fatigue”, like pointing at each item before a train leaves the station so that your eyes don’t glaze over during the safety checklist.
This isn’t an excuse for the behavior. It’s more about what the problem is and what a corresponding fix should address.
Waterluvian
31 minutes ago
I agree. The role of an editor is in part to do this train pointing.
I think it slips because the consequences of sloppy journalism aren’t immediately felt. But as we’re witnessing in the U.S., a long decay of journalistic integrity contributes to tremendous harm.
It used to be that to be a “journalist” was a sacred responsibility. A member of the Fourth Estate, who must endeavour to maintain the confidence of the people.
zahlman
12 hours ago
There's a weird inconsistency among the more pro-AI people that they expect this output to pass as human, but then don't give it the review that an outsourced human would get.
kaibee
11 hours ago
> but then don't give it the review that an outsourced human would get.
It's like seeing a dog play basketball badly. You're too stunned to be like "no, don't sign him to <home team>".
vidarh
12 hours ago
The irony is that, while far from perfect, an LLM-based fact-checking agent is likely to be far more diligent (but still needs human review as well), because it's trivial to ensure it has no memory of having already checked a long list of items (if you pass e.g. Claude a long list directly in the same context, it is prone to deciding the task is "tedious" and starting to take shortcuts).
But at the same time, doing that makes it even more likely the human in the loop will get sloppy, because there'll be even fewer cases where their input is actually needed.
I'm wondering if you need to start inserting intentional canaries to validate whether humans are actually doing sufficiently thorough reviews.
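A rough sketch of the fresh-context idea in Python, with client.complete standing in for whatever single-shot completion API is available (a hypothetical placeholder, not a specific vendor's SDK):

    # One independent call per claim: no shared context, so the model
    # can't decide a long list is "tedious" and start taking shortcuts.
    def check_fact(client, claim: str, source: str) -> str:
        prompt = ("Does the source below support this claim? "
                  "Answer SUPPORTED or UNSUPPORTED, then explain.\n\n"
                  f"Claim: {claim}\n\nSource:\n{source}")
        return client.complete(prompt)  # hypothetical single-shot API

    def check_all(client, claims: list[str], source: str) -> dict[str, str]:
        # Each claim gets a brand-new context; nothing carries over
        return {claim: check_fact(client, claim, source) for claim in claims}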
prussia
12 hours ago
The kind of people who use an LLM to write news articles for them tend not to be the people who care about mundane things like reading sources or ensuring that what they write bears any resemblance to the truth.
adamddev1
6 hours ago
The problem is that the LLM's sources can themselves be LLM-generated. I was looking up some health question and tried clicking through to the source for one of the LLM's claims. The source was a blog post containing an obvious hallucination or false elaboration.
kortilla
12 hours ago
The source would just be the article, which the Ars author used an LLM to avoid reading in the first place.
usefulposter
7 hours ago
Incredible. When Ars pulls an article and its comments, they wipe the public XenForo forum thread too, but Scott's post there was archived. Username scottshambaugh:
https://web.archive.org/web/20260213211721/https://arstechni...
>Scott Shambaugh here. None of the quotes you attribute to me in the second half of the article are accurate, and do not exist at the source you link. It appears that they themselves are AI hallucinations. The irony here is fantastic.
Instead of cross-checking the fake quotes against the source material, some proud Ars Subscriptors proceeded to defend Condé Nast by accusing Scott of being a bot and/or a fake account.
EDIT: Page 2 of the forum thread is archived too. This poster spoke too soon:
>Obviously this is massive breach of trust if true and I will likely end my pro sub if this isnt handled well but to the credit of ARS, having this comment section at all is what allows something like this to surface. So kudos on keeping this chat around.
bombcar
7 hours ago
This is just one of the reasons archiving is so important in the digital era; it's key to keeping people honest.
Imustaskforhelp
4 hours ago
Yes, the Wayback Machine/archive.org is one of the best websites on the whole World Wide Web.
asddubs
4 hours ago
I read the forum thread, and most people seem to be critical of Ars. One person said Scott is a bot, but that read to me as a joke about the situation.
vor_
2 hours ago
The comment calling him a bot is sarcasm.
JPKab
2 hours ago
I just wish people would remember how awful and unprofessional and lazy most "journalists" are in 2026.
It's a slop job now.
Ars Technica, a supposedly reputable institution, has no editorial review. No checks. Just a lazy slop cannon journalist prompting an LLM to research and write articles for her.
Ask yourself if you think it's much different at other publications.
0xbadcafebee
6 hours ago
> How many levels of outsourced thinking are occurring before it becomes a game of telephone?
How do you know quantum physics is real? Or radio waves? Or even basic health advice? We don't. We outsource our thinking about them to someone we trust, because thinking about everything down to its root source would leave us paralyzed.
Most people seem to have never thought about the nature of truth and reality, and AI is giving them a wake-up call. Not to worry though. In 10 years everyone will take all this for granted, the way they take all the rest of the insanity of reality for granted.
DonHopkins
6 hours ago
American citizens are having bad health advice AND PUBLIC HEALTH POLICIES officially shoved down their throats by a man who freely and publicly admits to not being afraid of germs because he snorts cocaine off of toilet seats, appointed by another angry senile old man who recommends injecting disinfectant and shoving an ultraviolet flashlight up your ass to cure COVID. We don't have 10 years left.
Lerc
9 hours ago
Has it been shown or admitted that the quotes were hallucinations, or is it the presumption that all made up content is a hallucination now?
vor_
2 hours ago
Another red flag is that the article used repetitive phrases in an AI-like way:
"...it illustrates exactly the kind of unsupervised output that makes open source maintainers wary."
followed later on by
"[It] illustrates exactly the kind of unsupervised behavior that makes open source maintainers wary of AI contributions in the first place."
Pay08
7 hours ago
You could read the original blog post...
Lerc
5 hours ago
How could that prove hallucinations? It could only possibly prove that they are not. If the quotes are in the original post, then they are not hallucinations. If they are not in the post, they could have been produced by something that is not an LLM.
Misquotes and fabricated quotes existed long before AI, and indeed long before computers.
DonHopkins
5 hours ago
How could reading the original blog post prove hallucinations??! Now you've moved the goalposts to defending your failure to read the original blog post, by denying it's possible to know anything at all for sure, so why bother reading.
So you STILL have not read the original blog post. Please stop bickering until AFTER you have at least done that bare minimum of trivial due diligence. I'm sorry if it's TL;DR for you to handle, but if that's the case, then TL;DC: Too Long; Don't Comment.
DonHopkins
5 hours ago
You're as bad as the lazy incompetent journalists. Just read the post instead of asking questions and pretending to be skeptical instead of too lazy to read the article this discussion is about.
Then you would be fully aware that the person who the quotes are attributed to has stated very clearly and emphatically that he did not say those things.
Are you implying he is an untrustworthy liar about his own words, when you claim it's impossible to prove they're not hallucinations?
jurgenburgen
10 minutes ago
There is a third option: The journalist who wrote the article made the quotes up without an LLM.
I think calling the incorrect output of an LLM a “hallucination” is too kind to the companies creating these models, even if it’s technically accurate. “Being lied to” would be more accurate as a description of how the end user feels.
tempestn
5 hours ago
I think you're missing their point. The question you're replying to is: how do we know that this made-up content is a hallucination, i.e. as opposed to being made up by a human? I think it's fairly obvious via Occam's Razor, but still, they're not claiming the quotes could be legit.
DonHopkins
2 hours ago
The point is they keep making excuses for not reading the primary source, and are using performative skepticism as a substitute for basic due diligence.
Vibe Posting without reading the article is as lazy as Vibe Coding without reading the code.
You don’t need a metaphysics seminar to evaluate this. The person being quoted showed up and said the quotes attributed to him are fake and not in the linked source:
https://infosec.exchange/@mttaggart/116065340523529645
>Scott Shambaugh here. None of the quotes you attribute to me in the second half of the article are accurate, and do not exist at the source you link. It appears that they themselves are AI hallucinations. The irony here is fantastic.
So stop retreating into “maybe it was something else” while refusing to read what you’re commenting on. Whether the fabrication came from an LLM or a human is not your get-out-of-reading-free card -- the failure is that fabricated quotes were published and attributed to a real person.
Please don’t comment again until you’ve read the original post and checked the archived Ars piece against the source it claims to quote. If you’re not willing to do that bare minimum, then you’re not being skeptical -- you’re just being lazy on purpose.
giobox
12 hours ago
More than ironic, it's truly outrageous, especially given the site's recent propensity for negativity towards AI. They've been caught red-handed here doing the very things they routinely criticize others for.
The right thing to do would be a mea-culpa style post and explain what went wrong, but I suspect the article will simply remain taken down and Ars will pretend this never happened.
I loved Ars in the early years, but I'd argue the site has been a shadow of its former self since the Condé Nast acquisition in 2008, trading on a formerly trusted brand name that recent iterations simply don't live up to anymore.
khannn
12 hours ago
Is there anything like a replacement? The three biggest tech sites that I traditionally love are Ars Technica, AnandTech (RIP), and Phoronix. One is in dead-man-walking mode, the second is dead dead, and the last is still going strong.
I'm basically getting tech news from social media sites now and I don't like that.
gtowey
11 hours ago
In my wildest hopes for a positive future, I hope disenchanted engineers will see things like this as an opportunity to start our own companies founded on ideals of honesty, integrity, and putting people above profits.
I think there are enough of us who are hungry for this, both as creators and consumers, to make goods and services that are truly what people want.
Maybe the AI revolution will spark a backlash that leads to a new economy with new values: sustainable businesses which don't need to squeeze their customers for every last penny of revenue, and which are happy to reinvest their profits into their products and employees.
Maybe.
remh
11 hours ago
I’ve really enjoyed 404media lately
khannn
11 hours ago
I like them too. About the only other contender I see is maybe TechCrunch.
I need to set up an email address and browser just for sites that require registration.
bombcar
7 hours ago
ServeTheHome has something akin to the old techy feel, but it has its own specific niche.
antod
12 hours ago
While their audience (and the odd staff member) is overwhelmingly anti-AI in the comments, the site itself doesn't seem to be, editorially.
jandrewrogers
11 hours ago
Conde Nast are the same people wearing Wired magazine like a skin suit, publishing cringe content that would have brought mortal shame upon the old Wired.
emmelaich
12 hours ago
Outrageous, but more precisely, it's malpractice and unethical not to double-check the result.
netsharc
12 hours ago
Probably "one bad apple", soon to be fired, tarred and feathered...
zahlman
12 hours ago
If Kyle Orland is about to be fingered as "one bad apple" that is pretty bad news for Ars.
JumpCrisscross
12 hours ago
“Kyle Orland has been the Senior Gaming Editor at Ars Technica since 2012” [1].
rectang
11 hours ago
There are apparently two authors on the byline and it’s not hard to imagine that one may be more culpable than the other.
You may be fine with damning one or the other before all the facts are known, zahlman, but not all of us are.
sho_hn
11 hours ago
I don't read their comment as implying this. It might in fact hint at the opposite; it's far more likely for the less senior author to get thrown under the bus, regardless of who was lazy.
pmontra
7 hours ago
Scapegoats are scapegoats, but in every organization the problems are ultimately caused by its leaders. It's what they request, what they fail to request, and what they fail to control.
llbbdd
12 hours ago
Honestly frustrating that Scott chose not to name and shame the authors. Liability is the only thing that's going to stop this kind of ugly shit.
rectang
11 hours ago
There is no need to rush to judgment on the internet's instant-gratification timescale. If consequences are coming for the journalist or the publication, they are inevitable.
We’ll know more in only a couple of days; how about we wait that long before administering punishment?
llbbdd
8 hours ago
It's not rushing to judgement; the judgement has been made. They published fraudulent quotes. Bubbling that liability up to Arse Technica is valuable for punishing them too, but the journalist is ultimately responsible for what they publish. There's no reason for any publication to ever hire them again when you can hire ChatGPT to lie for you.
EDIT: And there's no plausible deniability for this like there is for typos, or maligned sources. Nobody typed these quotes out and went "oops, that's not what Scott said". Benj Edwards or Kyle Orland pulled the lever on the bullshit slot machine and attacked someone's integrity with the result.
"In the past, though, the threat of anonymous drive-by character assassination at least required a human to be behind the attack. Now, the potential exists for AI-generated invective to infect your online footprint."
rectang
7 hours ago
We do not yet know how the story unfolded between the two people listed on the byline. Consider the possibility that one author fabricated the quotes without the knowledge of the other. The sin of inadequate paranoia about a deceptive colleague does not carry the same weight as the sin of deception.
Now to be clear, that’s a hypothetical and who knows what the actual story is — but whatever it is, it will emerge in mere days. I can wait that long before throwing away two lives, even if you can’t.
> Bubbling that liability up to Arse Technica is valuable for punishing them
Evaluating whether Ars Technica establishes credible accountability mechanisms, such as hiring an Ombud, is at least as important as punishing individuals.
stateofinquiry
2 hours ago
I agree that reserving judgement and separating the roles of individuals from the response of the organization are both critical here. It's not the first time one of their staff was found to have behaved badly; in the case that jumps to my mind from a few years ago, Peter Bright was sentenced to 12 years on sex charges involving a minor.[1] So, sometimes people do bad things, commit crimes, etc., but this may or may not have much to do with their employer.
Did Ars respond in any way after the conviction of their ex-writer? Better vetting of their hires might have been a response. Apparently there was a record of some questionable opinions held by the ex-writer. I don't know, personally, if any of their policies changed.
The current suspected bad behavior involves the possibility that the journalists were lacking integrity in their jobs. If this possibility is confirmed, I expect to see publicly announced structural changes in the editorial process at Ars Technica if I am to continue to be a subscriber and reader.
[1] https://arstechnica.com/civis/threads/ex-ars-writer-sentence...
Edit: Fixed italics issue
llbbdd
5 hours ago
That's what bylines are for, though. Both authors are attributed and are therefore both responsible. If they didn't both review the article before submitting it, that's their problem. It's exaggerating to call this throwing away two lives; if all they do for a living is hit the big green button on crap journalism, then I'm fine with them re-skilling to something less detrimental.
arduanika
28 minutes ago
I mean, I'm even more frustrated by this in Scott's original post:
> If you are the person who deployed this agent, please reach out. It’s important for us to understand this failure mode, and to that end we need to know what model this was running on and what was in the soul document. I’m not upset and you can contact me anonymously if you’d like.
I can see where he's coming from, and I suppose he's being the bigger man in the situation, but at some point one of these reckless moltbrain kiddies is going to have to pay. Libel and extortion should carry penalties whether you do it directly, via code that you wrote, or via code that you deployed without reading.
The 'hit piece' on Scott was pretty minor, so if we want to wait around for a more serious injury, that's fine, just as long as we're standing ready to prosecute when (not 'if') it happens.
asddubs
4 hours ago
I mean, he linked the archived article. You're one click away from the information if you really want to know.
neya
11 hours ago
Ars Technica has always been trash, even before LLMs, and is mostly an advertisement hub for the highest bidder.