hackernews client

ML promises to be profoundly weird

379 pointsposted 13 hours ago

423 Comments

munificent

9 hours ago

There is a whole giant essay I probably need to write at some point, but I can't help but see parallels between today and the Industrial Revolution.

Prior to the industrial revolution, the natural world was nearly infinitely abundant. We simply weren't efficient enough to fully exploit it. That meant that it was fine for things like property and the commons to be poorly defined. If all of us can go hunting in the woods and yet there is still game to be found, then there's no compelling reason to define and litigate who "owns" those woods.

But with the help of machines, a small number of people were able to completely deplete parts of the earth. We had to invent giant legal systems in order to determine who has the right to do that and who doesn't.

We are truly in the Information Age now, and I suspect a similar thing will play out for the digital realm. We have copyright and intellecual property law already, of course, but those were designed presuming a human might try to profit from the intellectual labor of others. With AI, we're in the industrial era of the digital world. Now a single corporation can train an AI using someone's copyrighted work and in return profit off the knowledge over and over again at industrial scale.

This completely unpends the tenuous balance between creators and consumers. Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article? Who will contribute to the digital common when rapacious AI companies are constantly harvesting it? Why would anyone plant seeds on someone else's farm?

It really feels like we're in the soot-covered child-coal-miner Dickensian London era of the Information Revolution and shit is gonna get real rocky before our social and legal institutions catch up.

Retric

4 hours ago

> Prior to the industrial revolution, the natural world was nearly infinitely abundant. We simply weren't efficient enough to fully exploit it.

This is just wildly incorrect. People started running out of trees during the early Iron Age. Woodlands have been a managed and often over exploited resource for a long time. Active agriculture vs passive woodlands vs animal grazing has been in constant tension for thousands of years across most of the globe.

jtbaker

11 minutes ago

the Stepchange show went fairly deep on this topic in their first episode (listened to it recently). https://www.stepchange.show/coal-part-i

felipeerias

2 hours ago

People had been hunting whales for centuries, but industrialisation gave them the means and the motivation to do so until near extinction.

d1l

2 minutes ago

Read Moby Dick some time my friend.

Quarrelsome

4 hours ago

> This is just wildly incorrect.

from an global perspective it isn't. Some places sure, like Western Europe, who in some cases had completed enclosure, but remember the new world had only been discovered a few hundred years ago at that point.

Just google maps the north part of South America, even today there are large swathes of undeveloped land across it and back then it was considerably less exploited. At that time it would have appeared infinite, especially to the European industrialists.

squigz

3 hours ago

> remember the new world had only been discovered a few hundred years ago at that point.

By White people*

Quarrelsome

2 hours ago

we're talking about the fucking industrial revolution, of course this defaults to the European perspective. Unless you wanna spit some new bars about Aztec foundries and train lines connecting meso-america in the 19th century, then the point stands. At that time, the world appeared to the industrialists of the industrial revolution to be infinite. Nor had humanity discovered the terrible side effects of fossil fuels on the atmosphere.

Why are you weirdly making this about race?

squigz

2 hours ago

Sure, of course it's convenient to ignore the native peoples and pretend that prior to the Industrial Revolution the rest of the world outside of Europe was some untapped well of resources that Europeans had a natural right to.

Who might be swept underfoot in this "Information Revolution", I wonder?

dTal

24 minutes ago

Nobody said anything about Europeans having a "natural right". Bad enough to derail a conversation with irrelevant political nitpicking, unforgiveable to use a strawman to do so. Boo.

8 hours ago

"but I can't help but see parallels between today and the Industrial Revolution"

You're not the only one.

The current Pope Leo XIV explicitly named himself after the the previous Leo, Pope Leo XIII, who was pope during the Industrial Revolution (1878-1903) and issued the influential Encyclical Rerum novarum (Rights and Duties of Capital and Labor) in response to the upheaval.

“Pope Leo XIII, with the historic Encyclical Rerum novarum, addressed the social question in the context of the first great industrial revolution,” Pope Leo recalled. “Today, the Church offers to all her treasure of social teaching in response to another industrial revolution and the developments of artificial intelligence.” A name, then, not only rooted in tradition, but one that looks firmly ahead to the challenges of a rapidly changing world and the perennial call to protect those most vulnerable within it.”

https://www.vatican.va/content/leo-xiii/en/encyclicals/docum...

https://www.vaticannews.va/en/pope/news/2025-05/pope-leo-xiv...

steveklabnik

9 hours ago

As you know, I deeply respect you. Not trying to argue here, just provide my own perspective:

> Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article?

I write things for two main reasons: I feel like I have to. I need to create things. On some level, I would write stuff down even if nobody reads it (and I do do that already, with private things.) But secondly, to get my ideas out there and try to change the world. To improve our collective understanding of things.

A lot of people read things, it changes their life, and their life is better. They may not even remember where they read these things. They don't produce citations all of the time. That's totally fine, and normal. I don't see LLMs as being any different. If I write an article about making code better, and ChatGPT trains on it, and someone, somewhere, needs help, and ChatGPT helps them? Win, as far as I'm concerned. Even if I never know that it's happened. I already do not hear from every single person who reads my writing.

I don't mean that thinks that everyone has to share my perspective. It's just my own.

munificent

8 hours ago

Agreed, totally! I still write and put stuff online.

But it definitely feels different now. It used to feel like I was tending a public garden filled with other people who might enjoy it. It still kind of feels like that, but there are a handful of giant combine machines grinding their way around the garden harvesting stuff and making billionaires richer at the same time.

It's not enough to dissuade me from contributing to the public sphere, but the vibe is definitely different.

Honestly, it reminds me a lot about the early days of Amazon. It's hard to remember how optimistic the world felt back then, but I remember a time when writing reviews felt like a public good because you were helping other people find good products. It was like we all wanted honest product information and Amazon provided a neutral venue for us to build it. Like Wikipedia for stuff.

But as Amazon got bigger and bigger and the externalities more apparent, it felt less like we were helping each other and more like we were help Bezos buy yet another yacht or media empire. And as the reviews got more and more gamed by shady companies, they became less of a useful public good. The whole commons collapsed.

I worry that the larger web and digital knowledge environment is going that way.

I still intend to create and share my stuff with the world because that's who I want to be. But I'll always miss the early days of the web where it felt like a healthier environment to be that kind of person in.

navaed01

19 minutes ago

It does feel like the collaborative, free open nature of the web has gone and the optimism that brought… it feels like no one would build Foursquare today. But then I wonder if I’m just old an jaded and to the younger generation creating content, for them the web is open and expressive- just in a different way

ryandrake

7 hours ago

> But as Amazon got bigger and bigger and the externalities more apparent, it felt less like we were helping each other and more like we were help Bezos buy yet another yacht or media empire.

The Internet-circulating quote comes to mind: Planet Earth is pretty much a vacation resort for around 500 rich people, and the remaining 8 billion of us are just their staff. The Relative Few have got the system set up perfectly so that whatever we do, we're probably serving/enriching them. AI doesn't really change this, but it does further it.

elros

2 hours ago

> The Internet-circulating quote comes to mind: Planet Earth is pretty much a vacation resort for around 500 rich people, and the remaining 8 billion of us are just their staff. The Relative Few have got the system set up perfectly so that whatever we do, we're probably serving/enriching them. AI doesn't really change this, but it does further it.

I don't necessarily disagree with the analysis on how Planet Earth is currently setup to be, but something that I've been thinking about lately, is that to the extent we can consume the public image of some of the Relative Few, they seem oddly unhappy.

munificent

6 minutes ago

I think you're right.

Anyone who finds themselves with $100m in their bank account and thinks, "No, I need more," is a person with a hole inside them that can never be filled.

NiloCK

4 hours ago

> It used to feel like I was tending a public garden filled with other people who might enjoy it. It still kind of feels like that, but there are a handful of giant combine machines grinding their way around the garden harvesting stuff and making billionaires richer at the same time.

An underrated upside to being harvested is that your voice has now effectively voted in the formation of the machine's constitution. In a broader ecological sense, you've still tended to a public garden, but in this case your work is part of the nutrient base for a different thing.

Broader still: after the machines squeeze all of our inputs into an opaque crystal, that crystal's very purpose is to leak it all back out in measured doses. Yes, "some billionaire" will own the lion's share of that process, but time so far is telling that efforts can be made to distill strong, open, public versions of the same.

munificent

4 minutes ago

> time so far is telling that efforts can be made to distill strong, open, public versions of the same.

I do really hope that part of the longer-term answer for AI is LLMs being run locally.

steveklabnik

8 hours ago

I can totally see that, for sure. I was much more likely to write a review long ago, now I don't even bother. (For buying stuff online, at least.) Maybe I lost my innocence about this stuff a long time ago, and so it's not so much LLMs that broke it for me, but maybe... I dunno, the downfall of Web 2.0 and the death of RSS? I do think that the old internet, for some definition of "old," felt different. For sure. I'll have to chew on this. I certainly felt some shock on the IP questions when all of this came up. I'm from the "information wants to be free" sort of persuasion, and now that largely makes me feel kinda old.

Also I'm not a fan of billionaires, obviously, but I think that given I've worked on open source and tools for so long, I kinda had to accept that stuff I make was going to be used towards ends I didn't approve of. Something about that is in here too, I think.

(Also, I didn't say this in the first comment, but I'm gonna be thinking about the industrial revolution thing a lot, I think you're on to something there. Scale meaningfully changes things.)

rafterydj

7 hours ago

I feel the future includes the sentiments you describe. It was a little before my time professionally, but I grew up reading that kind of thinking.

I do think that the open web stuff, decentralized, or at least more decentralized than currently, is the path forward. I've been reading about the AT protocol and it recently becoming an official working group with the IETF.

I feel a second order effect of making decentralized social networking easier, is making individuals more empowered to separate from what they don't believe in. The third order effect is then building separate infrastructure entirely.

As sad as that can be - in my personal opinion it runs the risk of ending the "world wide" part of the web - it appears to be the only way society can avoid enriching the few beyond reason.

munificent

6 hours ago

> I'm from the "information wants to be free" sort of persuasion, and now that largely makes me feel kinda old.

Me too, 100%. But that was during a moment in time when that information was more likely to be enabling a person who otherwise didn't have as many resources than enabling a billionaire to make their torment nexus 0.1% more powerful.

> I kinda had to accept that stuff I make was going to be used towards ends I didn't approve of. Something about that is in here too, I think.

Yeah, I've mostly made peace with that too.

The way I think about it is that when I make some digital thing and share it with the world, I'm (hopefully!) adding value to a bunch of people. I'm happiest if the distribution of that value lifts up people on the bottom end more than people on the top. I think inequality is one of the biggest problems in the world today and I aspire to have the web and the stuff I make chip away at it.

If my stuff ends up helping the rich and poor equally and doesn't really effect inequality one way or the other, I guess it's fine.

But in a world with AI, I worry that anything I put out there increases inequality and that gives me the heebie-jeebies. Maybe that's just the way things are now and I have to accept it.

idle_zealot

4 hours ago

> But in a world with AI, I worry that anything I put out there increases inequality and that gives me the heebie-jeebies. Maybe that's just the way things are now and I have to accept it.

This observation doesn't really clash with "information wants to be free." You just have to include LLMs in the category or "information," like Free Software types already do for all software. You don't need to abandon your principles, you should shift your demands. A handful of companies can't be allowed to benefit from free information and then put what they make behind a wall.

munificent

5 minutes ago

> A handful of companies can't be allowed to benefit from free information and then put what they make behind a wall.

What is there to prevent them?

navaed01

15 minutes ago

I don’t disagree with you, but this has been going on for a while… Google monetized the the by indexing it and monetized what you wanted to find. Facebook monetized the eyeballs from the pictures and posts you added. Now LLMs will monetize all web content. To play devil’s advocate - LLMs do give something back. Those with ideas and no coding experience can now build entire businesses for little to zero cost. This seems different

echion

2 hours ago

> Free Software types already do for all software

Free Software types also create software...they didn't just argue for a better license and try to regulate Sun/others to re-license their software; they wrote free (libre) versions of proprietary software and released it for free (cost), which is what counteracted the "[putting] what they make behind a wall". If you're saying "[some] LLMs should be free", I agree.

throwanem

7 hours ago

> the "information wants to be free" sort of persuasion

That was always a luxury of its peculiar historical moment, though, wasn't it? Barlow didn't have to care who paid for the infrastructure, but he was just bloviating.

randallsquared

5 hours ago

No, it's as true now as it was then. The intellectual property team didn't win on the merits or by law enforcement; it was the convenience of streaming anything at will for a monthly fee that did the trick.

idle_zealot

4 hours ago

> it was the convenience of streaming anything at will for a monthly fee that did the trick

That's not the whole story, though. There have been many community-driven projects to bring convenient access to copyrighted works to the masses in a convenient way. You may recall the meteoric success of Popcorn Time. Law enforcement shut them down. Without the hand of the state beating down any popular alternative to legal distribution it absolutely would be the dominant mode of media consumption.

bigyabai

7 hours ago

If raw resources (tree cutting) and manufacturing (book binding) is saturated, a fully-realized economy has just one step left: financialization.

You have to start finding ways to keep people hooked on books and make it a part of their regular lifestyle. One book can't be enough, and after a while you have to convince them to replace the books they already bought. New editions, Author's Footnotes, limited run release, all of the stops have to be pulled out to get consumers to show up en-masse. Because that's what they are - consumers, not readers - wallets to be squeezed until they're bled of all the trust they had in media.

I think about the publications I liked reading as a kid, like Joystiq and Polygon. Some of the best games journalism the industry produced, but inevitably doomed to fail as their competitors monetized further. The rest of traditional media has followed the same path, converging on some mercurial social network marketing tactic as the placeholder for big-picture brand strategy.

computably

6 hours ago

> A lot of people read things, it changes their life, and their life is better. They may not even remember where they read these things. They don't produce citations all of the time. That's totally fine, and normal. I don't see LLMs as being any different. If I write an article about making code better, and ChatGPT trains on it, and someone, somewhere, needs help, and ChatGPT helps them? Win, as far as I'm concerned. Even if I never know that it's happened. I already do not hear from every single person who reads my writing.

Not a contradiction but an addendum: plenty of creative pursuits are not about functional value, or at least not primarily. If somebody writes a seemingly genuine blog post about their family trauma, and I as the reader find out it's made-up bullshit, that's abhorrent to me, whether or not AI is involved. And I think it would be perfectly fair for writers who do create similar but genuine content to find it abhorrent that they must compete with genAI, that genAI will slurp up their words, and that genAI's mere existence casts doubt on their own authenticity. It's not about money or social utility, it's about human connection.

ai5iq

3 hours ago

The consent question gets weirder when agents have persistent memory. I run agents that accumulate context over weeks — beliefs extracted from observations, relationships with other agents. At what point does an agent's memory become its own work product vs. derivative of its training? There's no legal framework for that.

kokanee

3 hours ago

That seems fine if you're not publishing content for a living. A lot of people are.

lelanthran

8 hours ago

> I don't mean that thinks that everyone has to share my perspective. It's just my own.

I think you are walking all around the word "consent" and trying very hard to avoid it altogether.

Your perspective, because it refuses to include any sort of consent, is invalid. No perspective that refuses consent can be valid.

steveklabnik

8 hours ago

Consent is absolutely important, but that does not mean that every single thing in the entire world requires explicit consent. You did not ask me for consent to use my words in your comment. That does not mean you're a bad person.

Free use is an important part of intellectual property law. If it did not exist, the powerful could, for example, stifle public criticism by declaring that they do not consent to you using their words or likeness. The ability to do that is important for society. It is also just generally important for creating works inspired by others, which is virtually every work. There has to be lines for cases where requiring attribution is required, and cases where it is not.

lelanthran

7 hours ago

> You did not ask me for consent to use my words in your comment.

I am not representing your words as mine. I am not using your words to profit off. I am not making a gain by attributing your words to you.

> There has to be lines for cases where requiring attribution is required, and cases where it is not.

You are blurring the lines between "using a quote or likeness" and "giving credit to". I am skeptical that you don't know the difference between the two.

Regardless, any "perspective" that disregards the need to acquire consent is invalid. Even if you are going to ignore it, you have to acknowledge that you don't feel you need any consent from the people you are taking from.

This whole "silence is consent" attitude is baffling.

steveklabnik

7 hours ago

You made an incredibly strong statement that is much broader than what we are talking about. I am pointing out various cases where I think that broadness is incorrect, I am not equating the two.

I do not think that, if you read, say, https://steveklabnik.com/writing/when-should-i-use-string-vs... , and then later, a friend asks you "hey, should I use String or &str here?" that you need my consent to go "at the start, just use String" instead of "at the start, just use String, like Steve Klabnik says in https://steveklabnik.com/writing/when-should-i-use-string-vs... ". And if they say "hey that's a great idea, thank you" I don't think you're a bad person if you say "you're welcome" without "you should really be saying welcome to Steve Klabnik."

It is of course nice if you happen to do so, but I think framing it as a consent issue is the wrong way to think about it.

steveklabnik

4 hours ago

Even beyond that, the initial legal opinion we do have did in fact point to training being fair use: https://www.reuters.com/legal/litigation/anthropic-wins-key-...

However, I don't feel comfortable suggesting that this is settled just yet, one district judge's opinion does not mean that other future cases may disagree, or we may at some point get explicit legislation one way or the other.

GeoAtreides

4 hours ago

I was just enumerating some of the issues with the '''solid''' points OP made. Actually addressing them would take too long and be exercise in futility, here, in HN, in april 2026. Why would I put in the effort, for my comment to be flagged and sent to the void? or worse, persisted for ever and used for training without my consent?

And yes, you are right, the legal and moral question of fair use in training data hasn't been settled yet; we agree here.

lelanthran

6 hours ago

> But again, I am not telling anyone else that they must agree with me. Simply stating my own relationship with my own creative output.

Look, I'm not saying that you are doing that, I'm pointing out that "Silence is consent" is not as strong an argument that many think it is.

satvikpendem

5 hours ago

> you don't feel you need any consent from the people you are taking from

In most cases, no, I (and it seems most others) don't feel the need for that, it is only you who seems to have an ideological hangup over this.

ModernMech

5 hours ago

> you don't feel you need any consent from the people you are taking from.

What has been "taken", exactly?

altruios

7 hours ago

refuse consent?

You may need to clarify that thought.

I don't think the poster has a viewpoint that 'refuses consent', their viewpoint is their writing they put for others to view is for others to view, regardless of how it is viewed. They seem to be giving consent, not refusing it, no?

lelanthran

6 hours ago

> refuse consent?

Who said anything about refusing consent?

altruios

3 hours ago

> I think you are walking all around the word "consent" and trying very hard to avoid it altogether.

> Your perspective, because it refuses to include any sort of consent, is invalid. No perspective that refuses consent can be valid.

This is what I was responding to. I do not understand your thinking in this post.

xyzzyz

2 hours ago

Prior to Industrial Revolution, nobody could go hunt in the woods, because the woods were King’s, and poaching King’s game carried death penalty. Situation was similar on the continent: the tiny slivers of remaining wood lands were off limits.

Granted, things were different in the New World, as a result of mass depopulation event following the Columbian exchange. But even there, the megafauna was hunted to extinction soon after the humans first appeared there.

Anyway, the point is that no, prior to Industrial Revolution, the world was of full of scarcity, not abundance.

konschubert

8 hours ago

> Prior to the industrial revolution, the natural world was nearly infinitely abundant.

The opposite is true. Central Europe was almost devoid of trees. Food was scarce as arable land bore little fruit without fertiliser.

Society was Malthusian until the Industrial Revolution.

jsmo

6 hours ago

Can we interpret "abundant" in a Darwinian sense e.g. diversity of life? I would think the industrial farming revolution decreased crop variety over time same for animal lineages aside from the rapid increase in mixed poodle breeds.

aerhardt

5 hours ago

To add, I don’t think my ancestor Spaniards for example needed the help of machines to deplete mines in America. They also came already equipped with all kinds of legal systems, including the Requerimiento, which they read out loud to natives in preposterous spectacle.

In general the transition from feudalism to capitalism, including the formation of the legal systems that supported the latter, happened gradually for maybe up to four or five centuries before the steam engine had been invented.

Sure, the Industrial Revolution further accelerated the development of property rights, mercantile, and civil laws, but all in all I don’t think there’s much truth that machines were the primary cause of such developments.

jltsiren

5 hours ago

Not really Malthusian. Agricultural societies had adapted to keep the population stable during normal times and bounce back in a generation or two after bad times. Those cultural adaptations stopped working when childhood mortality declined.

Useful land was a scarce resource in more civilized regions, while labor was cheap. Given enough land, subsistence farmers could easily feed themselves outside particularly bad years. But much of the land belonged to local elites, and commoners had to work that land to fund the pursuits of the elites.

arjie

8 hours ago

If I'm being honest, I've never related to that notion of remuneration and credit being the primary reason to write something. I don't claim to be some great writer or anything, but I do have a blog I write quite often on (though I'm traveling in my wife's Taiwan now and haven't updated it in a while). But for me, I write because it feels good to do so. Sometimes there's a group utility in things like I edit a Google Maps listing to be correct even though "a faceless corporation is going to hoover up my work and profit off it without paying me for my work" and I might pick up a Lime bike someone's dropped into the sidewalk even though "a faceless corporation is externalizing the work of organizing the proper storage of their property on public land without paying the workers" or so on.

I just think it's nice to contribute to the human commons and it's fine if some subset of my fellow organism uses it in whatever way. Realistically, the fact that Brewster Kahle is paid whatever few hundred thousand he's paid for managing a non-profit that only exists because it aggregates other people's work isn't a problem for me. Or that Larry Page and Sergey Brin became ultra-rich around providing a search interface into other people's work. Or that Sam Altman and Dario Amodei did the same through a different interface.

This particular notion doesn't seem to be a post-AI trend. It seems to have happened prior to the big GPTs coming out where people started doing a lot of this accounting for contribution stuff. One day it'll be interesting to read why it started happening because I don't recall it from the past. Perhaps I just wasn't super plugged in to the communities that were complaining about Red Hat, Inc.

It's not that I don't understand if I sold my Subaru to a guy who immediately managed to sell it to another guy for a million times the money. I get that. I'd feel cheated. But if I contributed a little to it, like I did so Google would have a site to list for certain keywords so that they could show ads next to it in their search results, I just find it so hard to be like "That's my money you're using. Pay me!".

wat10000

8 hours ago

You do it as a hobby, that's fine. Some people do it for a living. And while they aren't owed a living doing that specific thing, it is going to be a big problem for them if they can't make money at it anymore.

I'm sure plenty of people feel the same way about software. They make software as a hobby and don't care about remuneration or credit. Meanwhile I write software for my day job and losing the ability to make money from it would be devastating.

MetaWhirledPeas

4 hours ago

> Some people do it for a living.

I was going to write, "not for long," which might be true for some. But then I realized there will always be a difference between LLM output and human writing. We don't read blogs because of their facts, we read them because of how the facts are presented and how the author's personality comes through on the page.

EDIT: That said, LLMs are great at faking it, and a lot of amateur writing will be difficult to distinguish from LLM output. So I'm disagreeing with myself a bit.

But we are talking about "slurping up" IP and regurgitating it right? OK. So if I slurp up Mickey Mouse and output Micky Mouse that's an offense. But what if I slurp up a billion images and output some chimera? That's what the LLMs do. And that's what humans do too.

arjie

4 hours ago

I'm not sure what you're even talking about, you're putting words and an argument into my mouth which I never said.

gritspants

3 hours ago

Well then I owe you an apology. Perhaps I inferred too much about your point of view and understood too little, which is my own loss. Sorry.

sweezyjeezy

5 hours ago

I think your numbers are off. TAM for office workers is ~20T a year, of which SWE compensation is ~3T. So if they can make 3T x 10% X 5 years = 1.5T that covers their current valuations. It's not as insane as you make out, even not taking into account the other high risk areas like legal, accounting etc

pnexk

5 hours ago

Hit the nail on the head with that framing. So many articles are now coming out addressing the anxieties about adoption of a new technology, but we genuinely don’t really need it as a society.

I still wonder if we really needed the iPhone or many other things we’re told is “progress” and innovation in an arrow of time manner. The future is not set in stone and things need not play out in this manner at all. Unlike the iPhone where most were excited by its possibilities (even if they traded precious privacy in the name of convenience), there’s not a clear reason that this version of LLM driven technologies represent significant upsides than downsides.

AlexCoventry

an hour ago

> If all of us can go hunting in the woods and yet there is still game to be found, then there's no compelling reason to define and litigate who "owns" those woods.

https://en.wikipedia.org/wiki/Feudalism

drob518

9 hours ago

A couple thoughts…

Mostly, AIs don’t recite back various works. Yes, there a couple of high profile cases where people were able to get an AI to regurgitate pieces of New York Times articles and Harry Potter books, but mostly not. Mostly, it is as if the AI is your friend who read a book and gives you a paraphrase, possibly using a couple sentences verbatim. In other words, it probably falls under a fair use rule.

Secondly, given the modern world, content that doesn’t appear online isn’t consumed much, so creators who are doing it for the money will certainly continue putting content online. Much of that content will be generated by AIs, however.

triceratops

8 hours ago

You're missing the point. This is the crux of munificent's argument IMO (and I've made variations of it as well)

> We have copyright and intellecual property law already, of course, but those were designed presuming a human might try to profit from the intellectual labor of others.

You getting a summary of a copyrighted work from a friend is necessarily limited by the number of friends you have, the amount of time they have to read stuff and talk to you, and so on. Machines (and AIs) don't have any limitations.

drob518

8 hours ago

Yes, true. But does that really shift the argument much? An AI is like the most well-read book nerd you’ve ever met. The AI has read everything. They still won’t recite Harry Potter for you at full length and reading what the original author wrote is part of the pleasure.

triceratops

7 hours ago

> An AI is like the most well-read book nerd you’ve ever met. The AI has read everything

But no real book nerd has read everything. Current law was designed for the capabilities of humans.

drob518

3 hours ago

Sure, we could change current law, but I think that only forces an AI company to buy one copy of every book. I don’t think it gives any sort of royalty stream to anyone beyond that. Copyright is literally the right to make copies. Once I have acquired a copy, I can read it, summarize it, transform it, etc. in myriad ways.

triceratops

44 minutes ago

You can't make copies though. AI training requires making copies of materials, even if they're purchased.

nrabulinski

8 hours ago

Does a literal book nerd profit megacorporations when they bring up books to you? While burning through a household worth of energy in the process? Also, I’d like to talk with such book nerd because they’d have opinions on books, potentially if I brought up something I have read we could exchange thoughts about it, they could make recommendations for me based on their complex experiences instead of statistics from Reddit comments. An LLM can do none of those, while also doing the former. It’s a lose-lose.

Also, a book nerd doesn’t take roughly ~all human created text to train to produce meaningful results. It’s just such a misplaced analogy and people have been making it ever since OpenAI announced chatgpt for the first time - why do people think “an LLM is just a human who read a lot”

charcircuit

5 hours ago

Megacorporations making profit is not some evil that needs to be stopped. The economy is not zero sum.

zephen

5 hours ago

> The economy is not zero sum.

This is true.

But it's not always positive sum, either.

> Megacorporations making profit is not some evil that needs to be stopped.

Externalities are a thing. It's not about the profit per se, but about how (a) the making of that profit might negatively impact others, and (b) the deployment of that profit in pursuit of rent-seeking and other antisocial behavior in order to insure its continued existence might also negatively impact others.

drob518

3 hours ago

Externalities are a thing, but this isn’t exactly dumping toxic waste into a river.

trinsic2

5 hours ago

>This completely unpends the tenuous balance between creators and consumers. Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article? Who will contribute to the digital common when rapacious AI companies are constantly harvesting it? Why would anyone plant seeds on someone else's farm?

I have been thinking about this. I was pretty amendment a few months ago that AI is going to make a lot of thing worse for everyone because of the externalities of the technology (Data Center Creep, lock in of models, ect) and it probably still will. But then someone suggested to me that I use Claude Code to upgrade my SSG site to the new version because I had been sitting on my ass as the years went by, missing deadline by deadline. I just couldn't put my self into gear to upgrade it. It was massively out of date 10 years plus and I knew it was going to be a nightmare to deal with the problems. I probably was making it more harder than it really was in my head.

So I purchase Claude Code pro and the thing upgrade my site pretty well. There were things it missed because I didn't know the problems existed in the first place until the upgrade was complete, but I had a working updated site in less than an hour. If I had done this myself it would have taking me days/weeks.

So at that point I realized something. Its a tool that can handle good amount of tasks I throw at it as long as I am specific. I think the problem with most people is they expect it to respond like a human. Thats not going to happen, IMHO. Maybe some day it will be more than what it is but right now its just a tool. I don't care what anyone says about AGI and the likes. Its not going to happen with the current iteration (the pattern recognition type) We are going to need more than that if we want to simulate a human brain..

The point is. And I know this is not going to be received very well, mostly because this tech is in the hands of people that are gatekeeping it, is that maybe someday we might reach a point where all of humanities knowledge is put into these things and we can use them to better our lives. Maybe at some point we don't need to hold onto or hoard things as if its the only way we can make a living? And instead we can build things just for the sake of creating it and improve humanity in the process? Obviously the commercial model of these things is not great, that is going to have to be dealt with, but I can see a future where we might be able to fix a lot of humanities problems with this technology as more and more good people put it to use for things that help humanity.

navaed01

26 minutes ago

The natural world was not meaningfully abundant… Way before the industrial revolution land which was once used for opening hunting was closed off by the ruling class. Even before the Industrial Revolution you had a new class of merchant and factory owners who earned riches to buy land and keep the poor from hunting on it. Much of the natural resources out of reach for the majority and only accessible by those with deep pockets

some_random

an hour ago

That is straightforwardly not true, land ownership was very well defined and the people who hunted in it without permission were prosecuted.

slibhb

3 hours ago

> Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article?

In the brave new world we're creating, people will write specifically for AI. If you can impress models so much that they "regurgitate" your work, then your work has achieved a kind of immortality.

EamonnMR

5 hours ago

> We have copyright and intellecual property law already, of course, but those were designed presuming a human might try to profit from the intellectual labor of others. With AI, we're in the industrial era of the digital world. Now a single corporation can train an AI using someone's copyrighted work and in return profit off the knowledge over and over again at industrial scale.

The idea that copyright simply doesn't apply to AI has more to do with AI companies deciding that they're not going to comply with those laws than the design of the laws. Also a very successful lobby against enforcement by positioning AI as a strategic necessity.

randomNumber7

5 hours ago

It's not possible (or at least extremely hard) to prove that the final weights they come up with resulted from copyright infringement.

Thats why they are evaluated so high on the stock market. Basically the will steal all the value of intellectual property in a semi legal way.

monocasa

7 hours ago

> Prior to the industrial revolution, the natural world was nearly infinitely abundant. We simply weren't efficient enough to fully exploit it. That meant that it was fine for things like property and the commons to be poorly defined. If all of us can go hunting in the woods and yet there is still game to be found, then there's no compelling reason to define and litigate who "owns" those woods.

I mean, medieval Europe (speaking broadly) had pretty well defined property rights wrt hunting. In fact, the forester at the time was thought of as one of the most corrupt jobs, as they'd commonly have side hustles poaching and otherwise illegally extracting resources from the lands they enforced and kept others from utilizing in a similar way. Quis custodiet ipsos custodes?

nick32661123

6 hours ago

Our only hope is that AI in the long run is both powerful and benevolent enough to be its own "whistleblower" in cases of misuse.

irishcoffee

2 hours ago

I struggle so hard with this anthropomorphism of LLMs. At the end of the day it's a statistical gradient descent predictor with a bunch of "shit" bolted on top to try and steer outputs in a specific way.

They don't have the actual concept of "benevolent"... or a concept of anything at all. Based on an input, they regress down a path of "what is the next most probable statistical token to output next" and that's fucking it, with the bolted-on shit manipulating these outputs a bit.

I don't doubt that at some point there will be some other AI leap, but I'm not even sure it'll be built on this foundation.

What really needs to be developed is an actual artificial brain of sorts. Much like an infant learns language from first principals, a real AI would have a phase of continuous growth, creating actual memories and being able to reflect upon them. I daresay context windows are not that.

I'd really like to encourage anyone to pump the brakes a bit on how these things actually work, and what they actually are. There is a reason sama is pivoting away from video, et. al. and into corporate software coding, much like anthropic.

AnthonyMouse

6 hours ago

> We are truly in the Information Age now, and I suspect a similar thing will play out for the digital realm.

The analogy seems to be backwards though. It would be as if we previously had a scarcity of land and because of that divided it up into private property so markets could maximize crop yield etc. and then someone came up with a way to grow food on asteroids using robots, and that food is only at the 20th percentile of quality but it's far cheaper. Suddenly food becomes much more abundant and the people who had been selling the 20th percentile food for $5 are completely out of the market because the new thing can do that for $0.05, and the people providing the 50th percentile food for $10 are also taking a hit because the price difference between what they're providing and the 20th percentile stuff just doubled.

The existing plantation owners then want to put a stop to this somehow, or find a way to tax it, but arguments like this have a problem:

> Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article?

This was already the status quo as a result of the internet. Newspapers were slowly dying for 20 years before there was ever a ChatGPT, because they had been predicated on the scarcity of printing presses. If you published a story in 1975 it would take 24 hours for relevant competitors to have it in their printed publication and in the meantime it was your exclusive. The customer who wants it today gets it from you. On top of that, there weren't that many competitors covering local news, because how many local outlets are there with a printing press?

Then blogs, Facebook, Reddit and Twitter come and anyone who can set up WordPress can report the news five minutes after you do -- or five hours before, because now everyone has an internet-connected camera in their pocket so the first news of something happening now comes in seconds from whoever happened to be there at the time instead of the next morning after a media company sent a reporter there to cover it.

The biggest problem we have yet to solve from this is how to trust reports from randos. The local paper had a reputation to uphold that you now can't rely on when the first reports are expected to come from people with no previous history of reporting because it's just whoever was there. But that's the same thing AI can't do either -- it's a notorious confabulist.

And it's the media outlets shooting themselves in the foot with this one, because too many of them have gotten far too sloppy in the race to be first or pander to partisans that they're eroding the one advantage they would have been able to keep. Damn fools to erode the public's trust in their ability to get the facts right when it's the one thing people would otherwise still have to get from them in particular.

pocksuppet

8 hours ago

Stuff gets put online when the reader isn't the customer. Someone is paying for a reader to be told certain things. So it's free at the point of reading.

randomNumber7

5 hours ago

> Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article?

I'm happy to miss all the stuff that was written just for the financial benefit of the author.

bluefirebrand

9 hours ago

> It really feels like we're in the soot-covered child-coal-miner Dickensian London era of the Information Revolution and shit is gonna get real rocky before our social and legal institutions catch up

The really discouraging part of this is that it feels like our social and legal institutions don't even care if they catch up or not.

Technology is speeding up and the lag time before anything is discussed from a legal standpoint is way, way too long

delusional

5 hours ago

>Prior to the industrial revolution, the natural world was nearly infinitely abundant.

jgammell

44 minutes ago

> hybrid Mamba/Gated linear attention layers,

Do any large-scale architectures use mamba? I was under the impression that people don't use it yet due to lack of efficient implementations.

> Training is also vastly more sophisticated

Is it? In what ways?

joefourier

33 minutes ago

Qwen3.5 uses Gated Delta Networks which is essentially Mamba 2 + Delta Rule. It’s quite hardware efficient.

> Is it? In what ways?

Just the reinforcement learning for reasoning, and then tool use for agents, could be its own topic.

janalsncm

6 hours ago

Yeah I also came here to be one of those People In The Comments the author refers to.

Transformers are not magical. They are just a huge improvement over other architectures at the time such as LSTMs and RNNs and even CNNs. They allowed us to throw more and more compute at the problem of next token prediction. And we’ve been riding that horse ever since.

Another big advancement that deserves mentioning is “reasoning” models that have the opportunity to spit out thinking tokens before giving a final answer.

None of this is to say transformers are the most principled approach. But they work.

saghm

31 minutes ago

This sounds almost identical to the article that's literally linked at the end of the paragraph that the parent comment quoted: https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson...

I don't think anything you're saying here is in disagreement with the points they're making.

zozbot234

5 hours ago

Transformers' greatest improvement over RNN/LSTM was to enable better parallelization of large-scale training. This is what enabled language models to become "large". But when controlling for overall size, more RNN/LSTM-like approaches seem to be more efficient, as seen e.g. in state space models. The transformer architecture does add some notable capabilities in accounting for long-range dependencies and "needle in a haystack" scenarios, but these are not a silver bullet; they matter in very specific circumstances.

joefourier

5 hours ago

With modern training techniques, RNNs (not just linear SSMs, potentially even vanilla LSTMs) can scale just as well as transformers or even better when it comes to enormous context lengths. Dot-product attention has better performance in a number of domains however (especially for exact retrieval) so the best architectures are likely to remain hybrid for now.

famouswaffles

2 hours ago

>With modern training techniques, RNNs (not just linear SSMs, potentially even vanilla LSTMs) can scale just as well as transformers or even better when it comes to enormous context lengths.

That's not true. Modern training techniques aren't enough. Vanilla RNNs with modern training techniques still scale poorly. You have to make some pretty big architectural divergences (throwing away recurrency during training) to get a RNN to scale well. None of the big labs seem to be bothered with hybrid approaches.

joefourier

an hour ago

> That's not true. Modern training techniques aren't enough. Vanilla RNNs with modern training techniques still scale poorly. You have to make some pretty big architectural divergences (throwing away recurrency during training) to get a RNN to scale well.

SSMs move the non-linearity outside of the recurrence which enables parallelisation during training. It is trivial to do this architectural change with an LSTM (see the xLSTM paper). Linear RNNs are still RNNs.

But you can still keep the non linearity by training with parallel Newtown methods, which work on vanilla LSTMs and scale to billion of parameters.

> None of the big labs seem to be bothered with hybrid approaches.

Does Alibaba not count? Qwen3.5 models are the top performers in terms of small models as far as my tests and online benchmarks go.

famouswaffles

43 minutes ago

>SSMs move the non-linearity outside of the recurrence which enables parallelisation during training. It is trivial to do this architectural change with an LSTM (see the xLSTM paper). Linear RNNs are still RNNs.

Removing the non-linearity from the recurrence path is exactly what constitutes a "pretty big architectural divergence." A linear RNN is an RNN in a structural sense, certainly, but functionally it strips out the non-linear state transitions that made traditional LSTMs so expressive, entirely to enable associative scans. The inductive bias is fundamentally altered. Calling that simply 'modern training techniques' is disingenous at best.

>But you can still keep the non linearity by training with parallel Newtown methods, which work on vanilla LSTMs and scale to billion of parameters.

That does not scale anywhere near as well as Transformers in compute spend. It's paper/research novelty. Nobody will be doing this for production.

>Does Alibaba not count? Qwen3.5 models are the top performers in terms of small models as far as my tests and online benchmarks go.

I guess there's some misunderstanding here because Qwen is 100% a transformer, not a hybrid RNN/LSTM whatever.

joefourier

17 minutes ago

> That does not scale anywhere near as well as Transformers in compute spend. It's paper/research novelty. Nobody will be doing this for production.

What exactly makes you so confident?

The world is not just labs that can afford billion dollar datacentres and selling access to SOTA LLMs at $30/Mtokens. Transformers are highly unsuitable for many applications for a variety of reasons and non-linear RNNs trained via parallel methods are an extremely attractive value proposition and will likely feature in production in the next products I work on.

> I guess there's some misunderstanding here because Qwen is 100% a transformer, not a hybrid RNN/LSTM whatever.

See the Qwen3.5 Huggingface description: https://huggingface.co/Qwen/Qwen3.5-27B > Efficient Hybrid Architecture: Gated Delta Networks combined with sparse Mixture-of-Experts deliver high-throughput inference with minimal latency and cost overhead.

drob518

9 hours ago

> It remains unclear whether continuing to throw vast quantities of silicon and ever-bigger corpuses at the current generation of models will lead to human-equivalent capabilities. Massive increases in training costs and parameter count seem to be yielding diminishing returns. Or maybe this effect is illusory. Mysteries!

I’m not even sure whether this is possible. The current corpus used for training includes virtually all known material. If we make it illegal for these companies to use copyrighted content without remuneration, either the task gets very expensive, indeed, or the corpus shrinks. We can certainly make the models larger, with more and more parameters, subject only to silicon’s ability to give us more transistors for RAM density and GPU parallelism. But it honestly feels like, without another “Attention is All You Need” level breakthrough, we’re starting to see the end of the runway.

xmprt

9 hours ago

I see a lot of researchers working on newer ideas so I wouldn't be surprised if we get a breakthrough in 5-10 years. After all, the gap between AlexNet and Attention is All You Need was only 6 years. And then Scaling Laws was about 3-4 years after that. It might seem like not much progress is being made but I think that's in part because AI labs are extremely secretive now when ideas are worth billions (and in the right hands, potentially more).

Of course 5-10 years is a long time to bang our heads against the wall with untenable costs but I don't know if we can solve our way out of that problem.

ghywertelling

7 hours ago

I think we will see models becoming small reasoning core which don't remember tonnes of facts but can reason with data fed to it or they can search.

8 hours ago

better tooling and integration

htrp

9 hours ago

We pay people to create more high quality tokens (mercor, turing) which are then fed into data generating processes (synthetic data) to create even more tokens to train on

drob518

9 hours ago

But does that really help, or do you get distortion? The frequency distribution of human generated content moves slowly over time as new subjects are discussed. What frequency distribution do those “data generating processes” use? And at root, aren’t those “data generating processes” basically just another LLM (I.e., generating tokens according to a probability distribution)? Thus, aren’t we just sort of feeding AI slop into the next training run and humoring ourselves by renaming the slop as “synthetic data?” Not trying to be argumentative. I’m far from being an AI expert, so maybe I’m missing it. Feel free to explain why I’m wrong.

htrp

7 hours ago

That's the problem in a nutshell. There is an art to how you generate the synthdata so that you don't get crappy trained models (especially when mistakes cost XX million dollars).

It's also theoretically why facebook paid 14bn for alex wang and scale ai

krainboltgreene

saghm

27 minutes ago

> LLMs with harnesses are clearly capable of engaging with logical problems that only need text.

> LLMs are clearly unable to propose new, creative solutions for problems it has never seen before.

How do you reconcile this with this article that the author linked? It's not a novel problem, and it's only text: https://medium.com/the-generator/one-word-answers-expose-ai-...

I guess it's a form of engagement to give a wildly wrong answer, but I'm not convinced that the extra nuance you've introduced is really all that nuanced either.

Aperocky

9 hours ago

> LLMs are clearly unable to propose new, creative solutions for problems it has never seen before.

LLMs are incredibly useful but I'm not sure about this statement.

It is proposing stuff that I haven't seen before, but I don't know about it is new or creative from the entirety of collective human knowledge.

saghm

26 minutes ago

I'm not sure if you misread the statement you quoted or I'm misreading yours, but it doesn't sound like you're really disagreeing with their point. Did you miss the "un" in "unable", or am I misunderstanding you as also saying that you don't consider them to be creative?

throwaway27448

10 hours ago

> LLMs with harnesses are clearly capable of engaging with logical problems that only need text.

To some extent. It's not clear where specifically the boundaries are, but it seems to fail to approach problems in ways that aren't embedded in the training set. I certainly would not put money on it solving an arbitrary logical problem.

simianwords

7 hours ago

> To some extent. It's not clear where specifically the boundaries are, but it seems to fail to approach problems in ways that aren't embedded in the training set. I certainly would not put money on it solving an arbitrary logical problem.

In what way can you falsify this without having the LLM be omniscient? We have examples of it solving things that are not in the training set - it found vulnerabilities in 25 year old BSD code that was unspotted by humans. It was not a trivial one either.

pessimizer

4 hours ago

Here's an odd example of testing, but I design very complex board and card games, and LLMs are terrible at figuring out whether they make sense or really even restating the rules in a different wording.

I thought they would be ideal for the job, until I realized that it would just pretend that the rules worked because they looked like board game rules. The more you ask it to restate, manipulate or simulate the rules, the more you can tell that it's bluffing. It literally thinks every complicated set of rules works perfectly.

> it found vulnerabilities in 25 year old BSD code that was unspotted by humans.

I don't think the age of the code makes the problem more complex. Finding buffers that are too small is not rocket science, bothering to look at some corner of some codebase that you've never paid attention to or seen a problem with is. AI being infinitely useful (cheap) to sic on pieces of codebase nobody ever carefully looks at is a great thing. It's not genius on the part of the AI.

simianwords

4 hours ago

> Here's an odd example of testing, but I design very complex board and card games, and LLMs are terrible at figuring out whether they make sense or really even restating the rules in a different wording.

I'm positive that they are perfectly fine and will a pretty good job. Did you actually try it?

CamperBob2

an hour ago

Well, could you define what reasoning actually means? What would an AI need to do to be considered capable of reasoning? What is the core difference between what we do that is considered reasoning verse what AI currently does that is not considered reasoning?

To be clear, I am not making a statement as to whether AI reasons or not. Its just slippery to say something isn't or can't do X when we can't really define X. Perhaps if we can put it down as an outcome rather than an, in my opinion, currently impossible to accurately define characteristic of a thing.

However, I'll grant you that Turing's original imitation game (text only, human typist, five minutes) is probably pretty close, and that's impressive enough to call intelligence (of a sort). Though modern LLMs tend to manifest obvious dead giveaways like "you're absolutely right!"

nine_k

9 hours ago

But context window exhaustion does not look like mere forgetfulness, but more like loss of general coherence, like getting drunk.

dairem

9 hours ago

Doesn't the Turing test require a human too, to be compared to the AI?

Sol-

7 hours ago

I don't know. Practically, LLMs are already better conversation partners on any topic compared to the average human I have access to. This also holds in reverse, of course - if someone wants me to explain something, usually they'd be better off asking an LLM.

beders

9 hours ago

Thank you for putting it so succinctly.

I keep explaining to my peers, friends and family that what actually is happening inside an LLM has nothing to do with conscience or agency and that the term AI is just completely overloaded right now.

tasuki

6 hours ago

> I keep explaining to my peers, friends and family that what actually is happening inside an LLM has nothing to do with conscience or agency

What would the insides have to look like to have anything to do with conscience or agency?

mkehrt

7 hours ago

One thing that has happened is that "AI" has been an academic discipline since literally the 1950s. The term was originally used in the hope that we would soon be able to emulate human minds. This turned out to be hard, but the name stuck to the discipline.

Now, suddenly, this name has been broadcast to every human in the world more or less. To them, it's a new term, and it obviously means something human mind-like. But to people who work on AI, that's not generally what it means. (Which isn't to say that some of them don't think we're near to achieving that; they just use other terms like "AGI" for that goal). So the name, which has a long history, is deceptive to people who aren't familiar with computer science.

saghm

22 minutes ago

> Now, suddenly, this name has been broadcast to every human in the world more or less. To them, it's a new term, and it obviously means something human mind-like. But to people who work on AI, that's not generally what it means. (Which isn't to say that some of them don't think we're near to achieving that; they just use other terms like "AGI" for that goal). So the name, which has a long history, is deceptive to people who aren't familiar with computer science.

I think it's even worse than that: people were familiar with the term already, but from science fiction, where it referred to actually human-level intelligence. It's similar to the "hoverboard" thing from a while back, except this time with profoundly higher stakes and requires for more technical knowledge to be able to see that it is in fact touching the ground.

erichocean

9 hours ago

AI is exactly the right term: the machines can do "intelligence", and they do so artificially.

Just like we have machines that can do "math", and they do so artificially.

Or "logic", and they do so artificially.

I assume we'll drop the "artificial" part in my lifetime, since there's nothing truly artificial about it (just like math and logic), since it's really just mechanical.

hackinthebochs

8 hours ago

>Structurally a transformer model is so unrelated to the shape of the brain there's no reason to think they'd have many similarities.

Substrate dissimilarities will mask computational similarities. Attention surfaces affinities between nearby tokens; dendrites strengthen and weaken connections to surrounding neurons according to correlations in firing rates. Not all that dissimilar.

rudhdb773b

8 hours ago

Sure the implementation details are different.

I suppose I should have asked by what definition of "consciousness and agency" are today's LLMs (with proper tooling) not meeting?

And if today's models aren't meeting your standard, what makes you think that future LLMs won't get there?

hedgehog

7 hours ago

Given the large visible differences in behavior and construction, akin to the difference between a horse and a pickup truck, I would ask the reverse question: In what ways do LLMs meet the definition of having consciousness and agency?

Veering into the realm of conjecture and opinion, I tend to think a 1:1 computer simulation of human cognition is possible, and transformers being computationally universal are thus theoretically capable of running that workload. That being said, that's a bit like looking at a bird in flight and imagining going to the moon: only tangentially related to engineering reality.

red75prime

3 hours ago

> In what ways do LLMs meet the definition of having consciousness and agency?

Agency: an ability to make decisions and act independently. Agentic pipelines are doing this.

Consciousness: something something feedback[1] (or a non-transferable feeling of being conscious, but that is useless for the discussion). Recurrent Processing Theory: A computation is conscious if it involves high-level processed representations being fed back into the low-level processors that generate it.

Tokens are being fed back into the transformer.

> that's a bit like looking at a bird in flight and imagining going to the moon: only tangentially related to engineering reality.

Is it? Vacuum of space is a tangible problem for aerodynamics-based propulsion. Which analogous thing do we have with ML? The scaled-up monkey brain[2] might not qualify as the moon.

[1] https://www.astralcodexten.com/p/the-new-ai-consciousness-pa...

[2] https://www.frontiersin.org/journals/human-neuroscience/arti...

ACCount37

6 hours ago

What about modern LLMs isn't "agentic" enough?

Doesn't matter if they're conscious for that. They're clearly capable of goal oriented behavior.

ACCount37

6 hours ago

If platonic representation hypothesis holds across substrates, then it might matter very little, in the end. It holds across architectures in ML, empirically.

The crowd of "backpropagation and Hebbian learning + predictive coding are two facets of the very same gradient descent" also has a surprisingly good track record so far.

qsera

9 hours ago

For starters, natural brains have the innate ability to differentiate between things that it knows and things that it have no possibility of knowing...

altcognito

6 hours ago

https://personal.utdallas.edu/~otoole/CGS2301_S09/7_split_br...

See page 53. While it is absolutely more prevelant in LLMs, human brains can also want a story for why their brains do things they are't plugged into.

throw310822

6 hours ago

Lol. Are you sure about that or you just made it up?

7 hours ago

I hesitate to tamper with an internet master's title, but "The Future of Everything is Lies, I Guess" doesn't really summarize what in fact is a balanced, informed overview which (to me at least) is above the median for one of these thought pieces. Since it's also baity and the HN guidelines ask for such titles to be rewritten, I've taken the license.

In such cases we always try to find a phrase from the article itself which expresses what it's saying in a representative way. (There nearly always is one.) In this case, both the very first and very last sentences do this, and it's interesting that they more or less agree. So I plucked the last sentence and put it above.

Edit: oof, I missed that this is actually the first part of a long series. Not sure what we'll do about the others; I expect some of those will make the frontpage as well.

Animats

5 hours ago

Changing the title was a good call.

The article has a good take on the "lie" problem. We know about the hallucination problem, which remains serious. The "lie" problem mentioned is that if you ask an LLM why it said or did something, it has no information of how it got a result. So it processes the "why" as a new query, and produces a plausible explanation. Since that explanation is created without reference to the internals of how the previous query was processed, it may be totally wrong. That seems to be the type of "lie" the author is worried about in this essay.

(Yes, humans do that too.)

post-it

7 hours ago

I appreciate the curation you do, dang. I often notice a headline get updated and the result is always a significant improvement.

ACCount37

7 hours ago

Honestly, good call on the title. The original one is far less representative. Far better at clickbait though.

dang

6 hours ago

Thank you both! I totally missed the sidebar on the OP which explains that this is Part 1 of what will be a long series. Not sure how we'll handle that...

dwallin

10 hours ago

n4r9

7 hours ago

Let's say a given B2B system deployment typically requires 100 custom behaviours/scripts and 3 years worth of effort. A team of ten people can execute such a deployment in 3-4 months. The team has the capacity to fix up issues caused by small human errors as they arise, since they show up roughly once a week.

With the advent of LLMs, a new deployment now takes 3 days. Consequently, errors requiring human attention crop up several times a day.

krainboltgreene

9 hours ago

Your project vue-skuilder has 6 github action steps devoted to checking the work you do before it's allowed to go out. You do not trust yourself to get things right 100% of the time.

I am watching people trust LLM-based analysis and actions 100% of the time without checking.

root_axis

9 hours ago

> Some people point at LLMs confabulating, as if this wasn’t something humans are already widely known for doing.

I think we need to start rejecting anthropomorphic statements like this out of hand. They are lazy, typically wrong, and are always delivered as a dismissive defense of LLM failure modes. Anything can be anthropomorphized, and it's always problematic to do so - that's why the word exists.

This rhetorical technique always follows the form of "this LLM behavior can be analogized in terms of some human behavior, thus it follows that LLMs are human-like" which then opens the door to unbounded speculation that draws on arbitrary aspects of human nature and biology to justify technical reasoning.

In this case, you've deliberately conflated a technical term of art (LLM confabulation) with the the concept of human memory confabulation and used that as a foundation to argue that confabulation is thus inherent to intelligence. There is a lot that's wrong with this reasoning, but the most obvious is that it's a massive category error. "Confabulation" in LLMs and "confabulation" in humans have basically nothing in common, they are comparable only in an extremely superficial sense. To then go on to suggest that confabulation might be inherent to intelligence isn't even really a coherent argument because you've created ambiguity in the meaning of the word confabulate.

hackinthebochs

8 hours ago

>this LLM behavior can be analogized in terms of some human behavior, thus it follows that LLMs are human-like

No, the argument is "this behavior is similar enough to human behavior that using it as evidence against <claim regarding LLM capability that humans have> is specious"

>"Confabulation" in LLMs and "confabulation" in humans have basically nothing in common

I don't know why you think this. They seem to have a lot in common. I call it sensible nonsense. Humans are prone to this when self-reflective neural circuits break down. LLMs are characterized by a lack of self-reflective information. When critical input is missing, the algorithm will craft a narrative around the available, but insufficient information resulting in sensible nonsense (e.g. neural disorders such as somatoparaphrenia)

root_axis

7 hours ago

> No, the argument is "this behavior is similar enough to human behavior that using it as evidence against <claim regarding LLM capability that humans have> is specious"

I'm not really following. LLM capabilities are self-evident, comparing them to a human doesn't add any useful information in that context.

> LLMs are characterized by a lack of self-reflective information. When critical input is missing, the algorithm will craft a narrative around the available, but insufficient information resulting in sensible nonsense (e.g. neural disorders such as somatoparaphrenia)

You're just drawing lines between superficial descriptions from disparate concepts that have a metaphorical overlap. It's also wrong. LLMs do not "craft a narrative around available information when critical input is missing", LLM confabulations are statistical, not a consequence of missing information or damage.

hackinthebochs

7 hours ago

>LLM capabilities are self-evident

This is undermined by all the disagreement about what LLMs can do and/or how to characterize it.

>LLM confabulations are statistical, not a consequence of missing information or damage.

> Collapsing the dimensionality is going to be lossy, which means it will have gaps between what it thinks is the reality and what is.

Confabulation has to do with degradation of biological processes and information storage.

There is no equivalent in a LLM. Once the data is recorded it will be recalled exactly the same up to the bit. A LLM representation is immutable. You can download a model a 1000 times, run it for 10 years, etc. and the data is the same. The closes that you get is if you store the data in a faulty disk, but that is not why LLMs output is so awful, that would be a trivial problem to solve with current technology. (Like having a RAID and a few checksums).

stronglikedan

10 hours ago

I don't even think they bullshit, since that requires conscious effort that they do not an cannot possess. They just simply interpret things incorrectly sometimes, like any of us meatbags.

thayne

9 hours ago

They make incorrect predictions of text to respond to prompts.

The neat thing about LLMs is they are very general models that can be used for lots of different things. The downside is they often make incorrect predictions, and what's worse, it isn't even very predictable to know when they make incorrect predictions.

lamasery

9 hours ago

I think this is leaning on the "lies are when you tell falsehoods on purpose; bullshit is when you simply don't care at all whether what you're saying is true" definition of bullshit. Cf. On Bullshit.

So, they can't lie, but they can (and, in fact, exclusively do) bullshit.

knowaveragejoe

10 hours ago

> No. LLMs do not confabulate they bullshit. There is a big difference. AIs do not care, cannot care, have not capacity to care about the output. String tokens in, string tokes out. Even if they have all the data perfectly recorded they will still fail to use it for a coherent output.

Isn't "caring" a necessary pre-requisite for bullshitting? One either bullshits because they care, or don't care, about the context.

marssaxman

10 hours ago

They're presumably referring to the Harry Frankfurt definition of bullshit: "speech intended to persuade without regard for truth. The liar cares about the truth and attempts to hide it; the bullshitter doesn't care whether what they say is true or false."

SoftTalker

8 hours ago

The bullshitter does have an objective in mind however. There is some ultimate purpose to his bullshitting. LLMs don't even have that. They just spew words.

dgb23

9 hours ago

Thought of the same book when reading the above.

simianwords

10 hours ago

You seem confident. Can you get it to bullshit on GPT-5.4 thinking? Use a text prompt spanning 3-4 pages and lets see if it gets it wrong.

I haven't seen any counter examples, so you may give some examples to start with.

dastapov

4 hours ago

Here we go. Would this do?

https://chatgpt.com/share/69d6cc45-1678-8384-bd9c-0f313021ff...

The correct answer in that the U and _ in the mdstat output cannot be mapped the the rest of the output by either position or indexes in square brackets, so you can't tell the exact nature of the failure from the mdstat output alone (for the record, the failed disk was sda).

So all of the "analysis" was bullshit, including "it's probably multiple partitions from multiple drives". But there are so many juicy numbered and indexed bits of info to pattern match on!

Notice how for the followup question it "thought" for 4 minutes, going in circles trying to make essentially random ordering to make some sort of ordered sense., and then bullshited its way to "it is sdb"

ghywertelling

8 hours ago

There are AI researchers who wrote blogposts which got to HN top about spiky spheres (I won't link the original blogpost making that claim to avoid hurt sentiments). Here's 3blue1brown correcting those AI/ML researchers intuitions.

https://www.youtube.com/watch?v=fsLh-NYhOoU&t=3238s

nothinkjustai

10 hours ago

It’s a failure mode of humans, it’s the entire mode of LLMs.

red-iron-pine

8 hours ago

people can and do confabulate, but generally I trust my intern to tell me "I don't know" and "I think it was X but tbh I have no fuckin clue"

the LLM will just lie to me "Good idea! You're totally right, we should do Y"

sillyfluke

10 hours ago

If you want to call it that, I find the confabulation in LLMs extreme. That level of confabulation would most likely be diagnosed as dementia in humans.[0] Hence, it is considered a bug not a feature in humans as well.

Now imagine a high-skilled software engineer with dementia coding safety-critical software...

[0] https://www.medicalnewstoday.com/articles/confabulation-deme...

zeroonetwothree

10 hours ago

And is that considered a feature of humans or a bug?

Is it something we want to emulate?

margalabargala

9 hours ago

I think we are maybe talking past each other?

Fitness is effective truth prediction, appropriately scoped.

A frog doesn't need to understand quantum physics to catch a fly. But if the frogs model of fly movement was trained on lies it will have a model that predicts poorly, won't catch flies, and will die.

There is another level to this in that the more complex and changing the environment the more beneficial a wider scoped model / understanding of truth.

However if you are going to lean fully into Hoffman and accept thatby default consciousness constructs rather than approximate reality I think we will have to agree to disagree. Personally I ascribe to Karl Friston free energy principle.

Unearned5161

7 hours ago

Articles like this should approach topics on consciousness with more humility than is displayed here.

We don’t even agree on a good definition of what’s going on inside our own heads yet, what gives you the confidence to say that what goes on inside an LLM can’t be conscious?

ACCount37

7 hours ago

Obviously, the LLMs lack the divine spark, so they can't be conscious. Same as clones, IVF babies, or half of all the twins.

Jest aside, I do agree. If you list out every prominent theory of consciousness, you'd find that about a quarter rules out LLMs, a quarter tentatively rules LLMs in, and what remains is "uncertain about LLMs". And, of course, we don't know which theory of consciousness is correct - or if any of them is.

doodpants

9 hours ago

> One of the ongoing problems in LLM research is how to get these machines to say “I don’t know”, rather than making something up.

To be fair, I've known humans who are like this as well.

arctic-true

8 hours ago

This is a limitation of the training data. If you were uncertain about something, you wouldn’t write a book about it. The kinds of people you’re talking about tend to generate far more text in their lives than others, because they can spend more time generating - writing books, blogposts, whatever - and less time thinking and working and actually doing things. The models never say they’re uncertain because we never say we’re uncertain, or at least we don’t write it down anywhere.

wmf

9 hours ago

Those people aren't the ones doing the work though.

jwpapi

6 hours ago

One really should have digested the manifold hypothesis. It’s the most likely explanation of how AI works.

The question is if there are ultradimensional patterns that are the solutions for meaningful problems. I’m saying meaningful, because so far I’ve mainly seen AI solve problems that might be hard, but not really meaningful in a way that somebody solving it would gain a lot of it.

However if these patterns are the fundamental truth of how we solve problems or they are something completely different, we don’t know and this is the 10 Trillion USD question.

I would hope its not the case, as I quite enjoy solving problems. Also my gut feeling tells me it’s just using existing patterns to solve problems that nobody tackled really hard. It also would be nice to know that Humans are unique in that way, but maybe this is the exact same way we are working ? This really goes back to a free will discussion. Yes very interesting.

But just to give an example on what I mean on meaningful problems.

Can an AI start a restaurant and make it work better than a human. (Prompt: "I’m your slave let’s start a restaurant)

Can an AI sign up as copywriter on upwork and make money? (Prompt: "Make money online")

Can an AI without supervision do a scientific breakthrough that has a provable meaningful impact on us. Think about("Help Humanity")

Can an AI manage geopolitics..

These are meaningful problems and different to any coding tasks or olympiad questions. I’m aware that I’m just moving the goalpost.

We really don’t know..

lamasery

7 hours ago

> People keep asking LLMs to explain their own behavior. “Why did you delete that file,” you might ask Claude. Or, “ChatGPT, tell me about your programming.”

Oh man, every business-side person in my company insists on reporting all the way to the UI a "confidence score" that the LLM generates about its own output and I've seen enough to know not to get between an MBA and some metric they've decided they really want even if I'm pretty sure the metric is meaningless nonsense, but... I'm pretty sure those are meaningless nonsense.

nomdep

10 hours ago

"As LLMs etc. are deployed in new situations, and at new scale, there will be all kinds of changes in work, politics, art, sex, communication, and economics."

For an article five years in the making, this is what I expected it to be about. Instead, we got a ramble about how imperfect LLMs are right now.

nathell

9 hours ago

The post is just a prelude to a 10-part article, most of which is not yet released (but will be shortly). Judging by the table of contents, the things you expected will be elaborated on in subsequent parts.

nomdep

9 hours ago

That changes it. I missed that the table of contents was for other future articles, my bad.

52-6F-62

9 hours ago

> Instead, we got a ramble about how imperfect LLMs are right now.

I wager this is a point that needs beaten into the common psyche. After all, it's been sold that it is not an imperfect tool, but the solution to all of our problems in every field forever. That's why these companies need billions upon billions of dollars of public subsidies and investments that would otherwise find their way to more pragmatic ends.

bstsb

10 hours ago

if you can’t access the page through region blocks:

https://archive.ph/I5cAE

_dwt

10 hours ago

I have a question for all the "humans make those mistakes too" people in this thread, and elsewhere: have you ever read, or at least skimmed a summary of, "The Origin of Consciousness in the Breakdown of the Bicameral Mind"? Did you say "yeah, that sounds right"? Do you feel that your consciousness is primarily a linguistic phenomenon?

I am not trying to be snarky; I used to think that intelligence was intrinsically tied to or perhaps identical with language, and found deep and esoteric meaning in religious texts related to this (i.e. "in the beginning was the Word"; logos as soul as language-virus riding on meat substrate).

The last ~three years of LLM deployment have disabused me of this notion almost entirely, and I don't mean in a "God of the gaps" last-resort sort of way. I mean: I see the output of a purely-language-based "intelligence", and while I agree humans can make similar mistakes/confabulations, I overwhelmingly feel that there is no "there" there. Even the dumbest human has a continuity, a theory of the world, an "object permanence"... I'm struggling to find the right description, but I believe there is more than language manipulation to intelligence.

(I know this is tangential to the article, which is excellent as the author's usually are; I admire his restraint. However, I see exemplars of this take all over the thread so: why not here?)

xandrius

10 hours ago

It feels like you probably went too deep in the LLM bandwagon.

An LLM is a statistical next token machine trained on all stuff people wrote/said. It blends texts together in a way that still makes sense (or no sense at all).

Imagine you made a super simple program which would answer yes/no to any questions by generating a random number. It would get things right 50% of the times. You can them fine-tune it to say yes more often to certain keywords and no to others.

Just with a bunch of hardcoded paths you'd probably fool someone thinking that this AI has superhuman predictive capabilities.

This is what it feels it's happening, sure it's not that simple but you can code a base GPT in an afternoon.

simianwords

9 hours ago

If it were not "just a statistical next token machine", how different would it behave?

Can you find an example and test it out?

xandrius

9 hours ago

Wait, you're asking to find and produce a example of a feasible and better alternative to LLMs when they are the current forefront of AI technology?

Anyway, just to play along, if it weren't just a statistical next token machine, the same question would have always the same answer and not be affected by a "temperature" value.

simianwords

9 hours ago

Thats also how humans behave.. I don't see how non determinism tells me anything.

My question was a bit different: if were not just a statistical next token predictor would you expect it to answer hard questions? Or something like that. What's the threshold of questions you want it to answer accurately.

camgunz

9 hours ago

Well, large models are (kinda) non-deterministic in two ways. The first is you actually provide many of them with a seed, which is easy to manage--just use the same seed for the same result. The second part is the "you actually have very little control over the 'neural pathways' the model will use to respond to the prompt". This is the baffling part, like you'll prompt a model to generate a green plant, and it works. You prompt it to generate a purple plant, and it generates an abstract demon dog with too many teeth.

Anyway, neither of these things describes human non-determinism. You can't reuse the seed you used with me yesterday to get the exact same conversation, and I don't behave wildly unpredictably given conceptually very similar input.

Apocryphon

8 hours ago

How do non-LLM based World Models behave?

simianwords

7 hours ago

Not sure, can you tell? I feel like you are saying that they may be able to move etc..

nine_k

10 hours ago

If you look at different ancient traditions, you will notice how they struggle with the limitations of language, with its inability to represent certain things that are not just crucial for understanding the world, but also are even somehow communicable. Buddhists dug into that in a very analytical, articulate way, for instance.

Another perspective: cetaceans are considered to be as conscious as humans, but any attempts to interpret their communication as a language failed so far. They can be taught simple languages to communicate with humans, as can be chimps. But apparently it's not how they process the world inside.

gbgarbeb

9 hours ago

You're a little out of date. Cetaceans communicate images to each other in the form of ultrasonic chirps. They chirp, they hear a reflection, and they repeat the reflection.

nine_k

8 hours ago

Does this resemble human language, with syntax, the ability to define new notions based on known notions, etc?

kgeist

4 hours ago

6 hours ago

> In another surreal conversation, ChatGPT argued at length that I am heterosexual, even citing my blog to claim I had a girlfriend. I am, of course, gay as hell, and no girlfriend was mentioned in the post. After a while, we compromised on me being bisexual.

This is a bit of a throwaway in the article, but when people talk about biases encoded in the algorithms, this is what they’re talking about.

yumiatlead

5 hours ago

The Industrial Revolution parallel holds up to a point. What it misses: the first industrial revolution required physical coordination — workers, factories, supply chains. The AI revolution requires organizational coordination. Who decides what the agent does, for whom, with whose authority? That governance layer doesn't exist yet, and it's not much a legal question but also an infrastructure question.

embedding-shape

10 hours ago

> In general, ML promises to be profoundly weird. Buckle up.

I love that it ends with such a positive note, even though it's generally a critical article, at least it's well reasoned and not utterly hyping/dooming something.

Thanks yet again Kyle!

kabir_daki

3 hours ago

Interesting perspective. The unpredictability of ML systems is both exciting and concerning. As developers we need to build guardrails while still allowing the technology to surprise us in useful ways.

hk__2

6 hours ago

8 hours ago

The Tower of Babel seems like an OK fit, but that's rather more poetic than what this seems to be getting at.

"Millennia" is what's really throwing me. We (respectable society, as the post outlines) didn't stop attempting alchemy or perpetual motion machines "millennia" ago, but a few centuries at most.

All I can think of is immortality. The very first surviving long recorded tale in human history that I'm aware of is about how it's a futile quest (The Epic of Gilgamesh, IIRC ~5,000ish years old in its earliest extant fragments, a few hundred years newer in reasonably-complete form). The trouble with that is despite wide observations over literally millennia that this has never even come close to working and repeated supposition and suggestion that it's unwise to attempt, outright impossible, or somehow sacrilegious (the "taboo" thing, as mentioned), I'm not aware of any time in history that rich people haven't been actively trying for it (including today! That's what all the body-freezing business is about, it's modern mummification, the contracts are the formulaic prayers carved in the tomb walls) and usually they're not exactly "scorned" or "ostracized" for it.

alexpotato

8 hours ago

> I asked if what they had done was ethical—if making deep learning cheaper and more accessible would enable new forms of spam and propaganda.

Someone asked Yuval Noah Harari, author of Sapiens, his thoughts on LLMs and how easy it was to create fake news, ai slop etc.

His response:

"People creating fake stories is nothing new. It's been going on for centuries. Humans have always dealt with it the same way: by creating institutions that they trust to only deliver factual information"

This could be government departments, newspapers, non-profits etc.

A personal note on this:

There is a Christmas card my grandfather made in the 1950s by "photoshopping" (by hand, not the software) images of each member of the family so it looked like they were all miniature versions of themselves standing on various parts of the fireplace. The world didn't collapse due to fake media between the 1950s and today due to people having that ability.

allturtles

8 hours ago

I see this kind of take a lot, and I don't think it's convincing. To me it's similar to saying that the water frame and the power loom won't change anything, because people have been able to make thread and cloth for millenia.

plagiarist

8 hours ago

Individuals with Photoshop making obvious fictions for entertainment is different from funded entities producing clips at scale and passed off as real.

dboreham

7 hours ago

I see the penny hasn't dropped yet that: humans are doing (roughly) the same dumb thing these models are doing. Humans are predisposed to not notice that though.

nisegami

9 hours ago

Here's the opening paragraph of chapter 2 with "people" subbed out for terms referring AI/models/etc.

"People are chaotic, both in isolation and when working with other people or with systems. Their outputs are difficult to predict, and they exhibit surprising sensitivity to initial conditions. This sensitivity makes them vulnerable to covert attacks. Chaos does not mean people are completely unstable; most people behave roughly like anyone else. Since people produce plausible output, errors can be difficult to detect. This suggests that human systems are ill-suited where verification is difficult or correctness is key. Using people to write code (or other outputs) may make systems more complex, fragile, and difficult to evolve."

To me, this modified paragraph reads surprisingly plainly. The wording is off ("using people to write code") and I had to change that part about attractor behavior (although it does still apply IMO), but overall it doesn't seem like an incoherent paragraph.

This is not meant to dunk on the author, but I think it highlights the author's mindset and the gap between their expectations and reality.

camgunz

9 hours ago

Humans and large models are both unpredictable and fallible, that's true, but in different ways, and (many) humans are actually much better at following directions.

If a junior dev makes the same mistake Claude makes, I can easily work with them to correct it, or I can fire them and get someone more capable to fix it. You mostly can't do that at all with large models. They're also far less honest than your average junior dev, so even as you're working with them you can't trust what they say.

There is a lot of this neat trick where it's like "humans do X too" but most of the time it elides large differences. Like, a human driver would probable not drag someone screaming multiple blocks. A human coder probably wouldn't generate a gibberish 3D scene and try to pass it off as done, etc. Maybe we can build systems that account for these (pretty wild) failure modes, but at least in software we haven't figured it out yet (what is the system that reliably reviews a 25kloc PR?).

Fraterkes

8 hours ago

What's your point? The ostensible benefit of LLM's is that you combine a computers' broad knowledgebase and capacity for exactness with fluency in human language.

A random human picked off the street is indeed bound to be difficult to predict and chaotic at a broad range of tasks, which is why I wouldn't blindly trust them to, say, summarize google search results or rewrite a codebase they are unfamiliar with.

busterarm

9 hours ago

Aren't you also making a large part of the author's point for him by effectively equating LLMs with people here and comparing on outputs?

Plausibly your text looks equivalent but we all (should) have the context to know better.

simianwords

7 hours ago

> Massive increases in training costs and parameter count seem to be yielding diminishing returns. Or maybe this effect is illusory.

But.. that's always been the case? Diminishing returns has always been the name of the game - utility tracks log(training effort). Its not such a big point that he makes it out to be.

josefritzishere

10 hours ago

I appreciate the directness of calling LLMs "Bullshit machines." This terminology for LLMs is well established in academic circles and is much easier for laypeople to understand than terms like "non-deterministic." I personally don't like the excessive hype on the capabilities of AI. Setting realistic expectations will better drive better product adoption than carpet bombing users with marketing.

AStrangeMorrow

10 hours ago

I have still mixed feelings about LLMs.

If I take the example of code, but that extends to many domains, it can sometimes produce near perfect architecture and implementation if I give it enough details about the technical details and fallpits. Turning a 8h coding job into a 1h review work.

On the other hand, it can be very wrong while acting certain it is right. Just yesterday Claude tried gaslighting me into accepting that the bug I was seeing was coming from a piece of code with already strong guardrails, and it was adamant that the part I was suspecting could in no way cause the issue. Turns out I was right, but I was starting to doubt myself

slopinthebag

9 hours ago

I think over time we will find better usage patterns for these machines. Even putting a model in a position to gaslight the user seems like a complete failure in the usage model. Not critiquing you at all on this, it's how these models are marketed and what all the tooling is built around. But they are incredibly useful and I think once we figure out how to use them better we can minimise these downsides and make ourselves much more productive without all the failures.

Of course that won't happen until the bubble pops - companies are racing to make themselves indispensable and to completely corner certain markets and to do so they need autonomous agents to replace people.

simianwords

10 hours ago

If it bullshits so much, you wouldn't have a problem giving me an example of it bullshitting on ChatGPT (paid version)? Lets take any example of a text prompt fitting a few pages - it may be a question in science or math or any domain. Can you get it to bullshit?

katatue

5 hours ago

I like to let new models write a few lines of Latin poetry - they rarely get the meter right.

I don't have access to paid ChatGPT right now, but here's Opus 4.6 with extra thinking enabled: https://claude.ai/share/6e0e8ef5-06e4-4514-ba7e-299357c1fc55

The initial draft fucks up the meter in lines 3 and 8, the final version gets line 2 wrong ("venit meis") and is somewhat obnoxious with verses 2 and 8 basically repeating each other. The thinking trace is useless and gives us no clue why the model exchanged a bland, but metrically correct first distich for a more interesting, but metrically incorrect one.

In fact, the "careful" examination of its own output completely skips the erroneously modified half-verse in line 2 - now, tell me that's a coincidence and not a sign of bullshitting.

pocksuppet

7 hours ago

https://discuss.systems/@palvaro/116286268110078647

Arguing with Gemini Home Assistant about whether or not it can turn off the lights. When the user gets frustrated and tells the LLM to kill itself, the LLM turns off the lights.

beders

9 hours ago

I think you highlight one of the problems with users of LLMs: You can't tell anymore if it is BS or not.

I caught Claude the other day hallucinating code that was not only wrong, but dangerously wrong, leading to tasks being failed and never recover. But it certainly wasn't obvious.

dgb23

8 hours ago

To me it’s the other way around. It’s difficult to trust (paid) ChatGPT‘s output consistently.

When I need exact, especially up to date facts, I have to constantly double check everything.

I split my sessions into projects by topic, it regularly mixes things up in subtle and not so subtle ways. There is no sense of actually understanding continuity and especially not causality it seems.

It’s _very_ easy to lead it astray and to confidently echo false assumptions.

In any case, I‘ve become more precise at prompting and good at spotting when it fails. I think the trick is to not take its output too seriously.

simoncion

9 hours ago

> If it bullshits so much, you wouldn't have a problem giving me an example of it bullshitting on ChatGPT (paid version)?

There's an entire paragraph in the essay about apyhr's direct experience with ChatGPT failures and sustained bullshitting that we'd never expect from a moderately-skilled human who possesses at least two functioning braincells. That paragraph begins "I have recently argued for forty-five minutes with ChatGPT". Do notice that there are six sentences in the paragraph. I encourage you to read all of them (make sure to check out the footnote... it's pretty good).

The exact text of the ChatGPT session is irrelevant; even if you reported that you were unable to reproduce the issue, it would only reinforce one of the underlying points -namely- that these systems are unreliable. aphyr has a pretty extensive body of published work that indicates that he'd not likely fabricate a story of an LLM repeatedly failing to accomplish a task that any moderately-skilled human could accomplish when equipped with the proper tools. So, I believe that his report is true and accurate.

simoncion

7 hours ago

There's also this seven-week-old example [0] (linked in the essay) of ChatGPT very confidently recommending a asinine course of action because it was unable to understand what the hell it was being told.

Listening to the audio is not required, as there's a reasonably accurate on-screen transcript, but it is valuable to listen to just how very hard they've worked to make this tool sound both confident and capable, even in situations where it's soul-crushingly incorrect. Those of us who have worked in Blasted Corporate Hellscapes may recognize how this manner of speaking can be very, very compelling to a certain sort of person (who -as it turns out- is frequently found in a management position).

[0] <https://www.instagram.com/reel/DUylL79kvub/>

simianwords

7 hours ago

This is classic case of not using the proper version. Use the thinking version gpt5.4 (text) and tell me if it bullshits.

Surely you must be able to find at least one example no?

simoncion

7 hours ago

To be clear, is your assertion that apyhr was also not using the proper version? If that is your assertion, do tell me how you've come by that information.

(You did notice that the author of the essay and the author of the video I linked to are not the same person, and that neither of them share a nym with me, yes?)

simianwords

7 hours ago

Hi, my position on the issue is that LLMs are powerful but may make mistakes in long context problems like coding (which the harness solves by feedback). But makes close to no (undergrad level) mistakes in questions that fit 2-3 pages. For you personally: do you believe me on this specific part on 2-3 pages?

I don't know what aphyr did and tbh his whole screed on LLMs make me feel he didn't use it properly or at least coming from a bad faith angle.

That's why I'm asking you (and others). Please come up with a text prompt spanning < 4 pages and lets see if it bullshits.

Surely the implication of such a screed is that it should be super simple to find at least one example of it clearly bullshitting in my constraint, no? Or am I interpreting the post in a bad faith way?

simoncion

7 hours ago

Neat.

So, despite the fact that it looks like you have to pay for ChatGPT Voice mode with video, [0] it doesn't count as an

  example of it bullshitting on ChatGPT (paid version)

That is, father_phi's use of what seems to be a paid version of ChatGPT to have a bullshit-filled conversation that definitely spans less than four pages doesn't count?

[0] The page at [1] declares that the video feature is "Available in ChatGPT Plus, Pro, Business, Enterprise, and Edu on mobile"

[1] <https://chatgpt.com/features/voice-with-video/>

simianwords

7 hours ago

Lets stick to my challenge please - thinking version, find bullshit. If you can't, thats ok. Do you accept then under the constraints that the thinking version doesn't produce bullshit?

simoncion

6 hours ago

Given aphyr's vocation (and how very lucrative it is), and how years and years of his writing indicates that he's very devoted to getting a correct and complete answer when investigating a question, I find it hard to believe that he's not using a paid version of the LLMs. If I knew him, I'd ask and verify, but I don't, so I won't.

> Lets stick to my challenge please...

I did. Your challenge was literally:

  If it bullshits so much, you wouldn't have a problem giving me an example of it bullshitting on ChatGPT (paid version)? Lets take any example of a text prompt fitting a few pages - it may be a question in science or math or any domain. Can you get it to bullshit?

father_phi's two-sentence question about the whether one can use a cup that's closed at the top and open at the bottom definitely counts. Given what I've mentioned about apyhr above, I expect he has already run your challenge on the fanciest-available version and reported on the results in the essay under discussion.

simianwords

6 hours ago

> Use the thinking version gpt5.4 (text) and tell me if it bullshits

This was what I said. Text! Despite me specifically asking for text, you've shown a voice example. Not sure why?

I believe you and I agree that GPT 5.4 thinking on text that fits < 4 pages never bullshits? Then we are good!

If we agree on this, I think the post doesn't capture this in spirit.

simoncion

6 hours ago

> This was what I said. Text!

No, that's what you said after I provided an example of paid ChatGPT emitting complete bullshit from a two sentence prompt.

The challenge you issued is at [0].

[0] <https://news.ycombinator.com/item?id=47692592>

simianwords

6 hours ago

> If it bullshits so much, you wouldn't have a problem giving me an example of it bullshitting on ChatGPT (paid version)? Lets take any example of a text prompt fitting a few pages - it may be a question in science or math or any domain. Can you get it to bullshit?

I have clearly written text prompt here. And I repeated a few times. It’s not my fault you didn’t read it. You are coming across as a bit of a bad faith arguer.

In any case, you agree that under these constraints bullshitting doesn’t exist?

simoncion

6 hours ago

> I have clearly written text prompt here.

How do you think the "voice" interface works? It runs speech-to-text on the input and turns the input into text. The LLMs don't decode voice, they work on text.

You can see this process in action on many of father_phi's videos.

Regardless, I expect that aphyr's reported results are on the very latest publicly-available ChatGPT models.

simianwords

6 hours ago

Very bad faith arguments. I clearly said text and you disregarded it multiple times and you are still arguing.

You've still not given me a single example of it bullshitting 5.4 thinking in text. It shows a lot that you have ignored this multiple times. Unfortunate!

simoncion

6 hours ago

I'm not sure why you're ignoring aphyr's reports. I'm also unsure why you're ignoring my original statement that having the text of the conversation that lead ChatGPT to bullshit is entirely irrelevant, as being unable to repro the report is even worse for ChatGPT than being able to repro would be.

shrug

simianwords

6 hours ago

I specified text just to ignore the voice one because it uses 4o-mini underneath. And its kinda stupid to keep ignoring that and saving face now - reconsider this approach.

I believe this is the 5th time I'm asking this: you are not able to produce a _single_ counter example for my challenge? After all this surely I can get a direct acknowledgement here.

simoncion

6 hours ago

> you are not able to produce a _single_ counter example for my challenge?

I have. For both your original challenge and your updated one.

Consider:

1) AFAICT, there's no way to tell what version of the model was used to produce the output in a ChatGPT share link.

2) You don't appear to believe my assertions that aphyr is almost certainly paying for and using the latest version of the LLMs available, and that he's faithfully reporting his interactions with the LLMs.

3) Because of #2, I expect that you won't believe me if I report that I've more-or-less reproduced father_phi's results about the cup that's sealed on the top and open on the bottom on the very latest only-available-for-pay ChatGPT model.

3a) You might attempt to check my report, but I'd be shocked if you'd consider a failure to reproduce my results to be a significant strike against ChatGPT. I'd think it's more likely that you'd either call me a liar, or tell me that I must have had some setting wrong somewhere.

3b) Even if you told me to share the ChatGPT chat that proved my assertion, #1 -combined with your demeanor throughout this conversation- tells me that you'd almost certainly claim that I was using an inferior version of the model and was lying to you.

simianwords

5 hours ago

Haha ok. So still no example?

The GPT shared link shows a "thought for" which indicates using the latest thinking model. You may try that.

What you can do is this: submit a prompt that clearly makes GPT hallucinate.

You may secretly use a worse model. You may use a system prompt that deliberately gives wrong answers. But I'm going to assume you won't go that far.

We can leave it to the public to decide whether this is a legitimate counter example or not and whether it can really be reproduced. Shall we try that? I'm guessing you won't but worth a shot!

simoncion

4 hours ago

You weren't paying much attention to the "Consider:" part of my previous comment.

You don't believe that a well-paid, very careful, high-integrity member of the computer safety community has -on multiple occasions- encountered actual, sustained bullshiting from the latest-available for-pay version of ChatGPT. You don't accept either this fellow's reports or my informed assessment of his computing situation as truthful and accurate. On top of that, your goalpost-shifting and general demeanor throughout this conversation simply don't give me the impression that you've much integrity. I'm not spending the equivalent of ten-to-twenty six-packs to reproduce aphyr's work and -given the evidence I have before me- have you reject that, as well.

200 USD is a lot of money to throw away to "win" an Internet argument with a stranger who refuses to accept evidence presented by someone known to be careful, scrupulous, and honest.

simianwords

4 hours ago

> On top of that, your goalpost-shifting and general demeanor throughout this conversation simply don't give me the impression that you've much integrity. I'm not spending the equivalent of ten-to-twenty six-packs to reproduce aphyr's work and -given the evidence I have before me- have you reject that, as well.

Lol what goal post did I move? I said text only and you rejected it. You can present the example here and let the public judge it - even if my integrity is compromised. I'm allowing you to do it.

> 200 USD is a lot of money to throw away to "win" an Internet argument with a stranger who refuses to accept evidence presented by someone known to be careful, scrupulous, and honest.

200 what? I'm using the $20 one. This is getting ridiculous!

You can't present a _single_ counter example!

erichocean

9 hours ago

> Models do not (broadly speaking) learn over time. They can be tuned by their operators, or periodically rebuilt with new inputs or feedback from users and experts. Models also do not remember things intrinsically: when a chatbot references something you said an hour ago, it is because the entire chat history is fed to the model at every turn. Longer-term “memory” is achieved by asking the chatbot to summarize a conversation, and dumping that shorter summary into the input of every run.

This is the part of the article that will age the fastest, it's already out-of-date in labs.

lamasery

8 hours ago

I'm struggling to reckon how that can even possibly be true, unless we're counting automation of the "dumping that shorter summary into the input of every run" thing.

I can imagine it being true with models so small that each user could afford to have their own, but not with big shared models like what're getting used for all the major services. Is that what you mean?

hackinthebochs

7 hours ago

I see nothing to preclude a foundation model being augmented by a smaller model that serializes particulars about an individuals cumulative interaction with the model and then streamlines it into the execution thread of the foundation model.

erichocean

8 hours ago

> Is that what you mean?

I think the confusion is that, when I write "model", you read "LLM."

LLMs aren't the only kind of AI model, and they have the limitations Aphyr mentions, for the obvious reasons you're thinking of.

His mistake is thinking that's the only model that exhibits intelligence today, but it's not.

qsera

9 hours ago

Source?

dgb23

8 hours ago

In what way?

simianwords

6 hours ago

I so far asked few people to make GPT-5.4 thinking to bullshit (with max 4 pages of prompt), no one can find an example.

But the way people speak in general, as well as this post, implies that such a challenge can easily be beaten. If so, I'm not able to find examples.

bitwize

10 hours ago

The fact that these "bullshit machines" have already proven themselves relatively competent at programming, with upcoming frontier models coming close to eliminating it as a human activity, probably says a lot about the actual value and importance of programming in the scheme of things.

slopinthebag

9 hours ago

I think it says more about the amount of automation we left on the table in the last few decades. So much of the code LLM's can generate are stuff that we should have completely abstracted away by now.

dgb23

7 hours ago

Abstractions over what?

A large amount of code is likely just idiosynchratic information processing, because we don’t agree on data models and meaning of terms and structure of protocols.

Also we repeatedly choose easy and popular over alternatives that would require design and scrutiny.

This is why things like language models and vector databases are useful. It’s basically the most expensive way possible to give up on that notion.

LogicFailsMe

9 hours ago

Old and stupid hot take IMO. I want the time back I put into perusing this. Even the scale of LLMs is puny next to the scale of lying humans and the sheer impact one compulsively lying human can have given we love to be led by confidently wrong narcissists. I mean if that isn't obvious by now, I guess it never will be. The Vogon constructor fleet is way overdue in my book.

Meanwhile, engineers are achieving increasingly impressive and sophisticated things with coding agents, lies, warts, and all, but that doesn't play well with the narrative, so let's just pretend they aren't.

52-6F-62

9 hours ago

> The Vogon constructor fleet is way overdue in my book

Don't you see it? That's exactly what "AI" in this context is.

It's the bypass.

Where does it end, eh? Build a quantum "AI" that will end up just needing more data, more input. The end goal must starts looking like creating an entirely new universe, a complete clone of everything we have here so it can run all the necessary computations and we can... ? (You are what a quantum AI looks like as it bumbles through the infinitude of calculable parameters on its way to the ultimate answer)

LogicFailsMe

9 hours ago

You have absolutely no sense of perspective. We are all metabolically expensive meat machines whose only value is to propagate our genetic money shot. That we get to briefly entertain ourselves with consciousness and culture is IMO likely a mystery we will never solve without upgrading to running in a substrate more advanced than the MVP for sentience we currently pilot. Will we get there or will we wipe ourselves out like every contender that preceded us? Stay tuned...

But spoilers: DNA will be fine, meat machines maybe not so much...

For a bunch of people addicted to the works of Charlie Stross, Neil Stephenson, and Iain Banks, y'all are a bunch of luddites. Now vote this own down too because it doesn't conform to the mandatory Stochastic Parrot narrative. You have no free will and you must downvote after all. Why do you even read their works when any step towards their world is consistently greeted as the worst thing evah(tm)? What? You were expecting the United Federation of Planets without the eugenics and nuclear wars that led to it finally being a good idea? Bless your hearts.

And if you're worried about billionaires and tyrants, start taxing the former and stop electing the latter or STFU and let the free Markov process of history play itself out. Quoting fictional Ambassador Kosh: the avalanche has started, it's too late for the pebbles to vote.

You asked where it ends. Don't ask questions if you don't like answers. Quick reminder: shun and downvote the non-conforming opinion.

bensyverson

11 hours ago

I get the frustration, but it's reductive to just call LLMs "bullshit machines" as if the models are not improving. The current flagship models are not perfect, but if you use GPT-2 for a few minutes, it's incredible how much the industry has progressed in seven years.

It's true that people don't have a good intuitive sense of what the models are good or bad at (see: counting the Rs in "strawberry"), but this is more a human limitation than a fundamental problem with the technology.

the_snooze

10 hours ago

Two things can be true at the same time: The technology has improved, and the technology in its current state still isn't fit for purpose.

10 hours ago

There's nothing in these models that say its purpose is software development. Their design and affordances scream out "use me for anything." The marketing certainly matches that, so do the UIs, so do the behaviors. So I take them at their word, and I see that failure modes are shockingly common even under regular use. I'm not out to break these things at all. I'm being as charitable and empirical as I can reasonably be.

If the purpose is indeed software development with review, then there's nothing stopping multi-billion dollar companies from putting friction into these sytems to direct users towards where the system is at its strongest.

nradov

9 hours ago

>95% is not my experience and frankly dishonest.

Quite frankly, this is exactly like how two people can use the same compression program on two different files and get vastly different compression ratios (because one has a lot of redundancy and the other one has not).

simianwords

9 hours ago

I'm asking for a single example.

qsera

9 hours ago

But why do you need an example? Isn't it pretty well understood that LLMS will have trouble responding to stuff that is under represented in the training data?

You will just won't have any clue what that could be.

simianwords

9 hours ago

fair so it must be easy to give an example? I have ChatGPT open with 5.4-thinking. I'm honestly curious about what you can suggest since I have not been able to get it to bullshit easily.

qsera

9 hours ago

I am not the OP, an I have only used ChatGPT free version. Last day I asked it something. It answered. Then I asked it to provide sources. Then it provided sources, and also changed its original answer. When I checked the new answers it was wrong, and when I checked sources, it didn't actually contain the information that I asked for, and thus it hallucinated the answers as well as the sources...

simianwords

9 hours ago

I trust you. If it were happening so frequently you may be able to give me a single prompt to get it to bullshit?

the_snooze

7 hours ago

I did this in one attempt just now: https://gemini.google.com/share/b4e016be1f69

#8 has an incorrect answer (3 appearances according to Gemini, 2 according to reality https://en.wikipedia.org/wiki/Bowl_championship_series#BCS_a...)

So it works well 95% of the time for literally a trivial use case. Imagine if any other tech tool had that kind of reliability: `ls` displays 95% of your files, your phone successfully sends and receives 95% of text messages, or Microsoft Word saving 95% of the characters you typed in. That's just not acceptable.

simianwords

6 hours ago

Hi! The challenge was ChatGPT but even then it looks like you used the weakest version of Gemini.

the_snooze

4 hours ago

>I stress test commercially deployed LLMs like Gemini and Claude with trivial tasks

I did exactly what I said I did. I'm using these systems the way they're designed and advertised. I'm following the happy path with tasks that are small, trivial, and easy to check. This is the charitable approach. Yet the system creaks under the lightest load. If Google wants to put on a better show with stronger models, then they should make those the default.

You don't need to make excuses for shoddy engineering from multi-billion dollar corporations. And you're quite welcome to run the same prompt on ChatGPT and evaluate it on your own time.

floren

10 hours ago

Six months bro, we're still so early

Arainach

10 hours ago

Whether LLMs can create correct content doesn't matter. We've already seen how they are being used and will be used.

Fake content and lies. To drive outrage. To influence elections. To distract from real crimes. To overload everyone so they're too tired to fight or to understand. To weaken the concept that anything's true so that you can say anything. Because who cares if the world dies as long as you made lots of money on the way.

danny_codes

10 hours ago

> Because who cares if the world dies as long as you made lots of money on the way.

Guiding principle of the AI industry

gdulli

10 hours ago

It's really the whole tech industry as it exists right now and AI is a victim of bad timing. If this AI had been invented 40 years ago there'd have been a lower ceiling on the damage it could do.

Another way of saying that is that capitalism is the real problem, but I was never anti-capitalist in principle, it's just gotten out of hand in the last 5-10 years. (Not that it hadn't been building to that.)

palmotea

9 hours ago

Yes, there have been improvements on them, but none of those improvements mitigate the core flaw of the technology. The author even acknowledges all of the improvements in the last few months.

karmakaze

10 hours ago

Bullshit is the perfect term here, even as AI's get so much better and capable Brandolini's Law aka the "bullshit asymmetry principle" always applies--the energy required to refute misinformation is an order of magnitude larger than that needed to produce it. Even to use AIs effectively today requires a very good BS detector--some day in the future it won't.

p_stuart82

10 hours ago

models are improving. the pricing already assumes they're ready for prod. that's where the fires start

ura_yukimitsu

10 hours ago

iamjackg

11 hours ago

The problem, unfortunately, is the scale. It's always scale. Humans make all the kinds of mistakes that we ascribe to LLMs, but LLMs can make them much faster and at much larger scale.

Models have gotten ridiculously better, they really have, but the scale has increased too, and I don't think we're ready to deal with the onslaught.

SkyBelow

10 hours ago

Scale is very different, but I wonder if human trust isn't the real issue. We trust technology too much as a group. We expect perfection, but we also assume perfection. This might be because the machines output confident sounding answers and humans default to trusting confidence as an indirect measure for accuracy, but I think there is another level where people just blindly trust machines because they are so use to using them for algorithms that trend towards giving correct responses.

Even before LLMs where in the public's discourse, I would have business ask about using AI instead of building some algorithm manually, and when I asked if they had considered the failure rate, they would return either blank stares or say that would count as a bug. To them, AI meant an algorithm just as good as one built to handle all edge cases in business logic, but easier and faster to implement.

We can generally recognize the AIs being off when they deal in our area of expertise, but there is some AI variant of Gell-Mann Amnesia at play that leads us to go back to trusting AI when it gives outputs in areas we are novices in.

nyeah

10 hours ago

"Lies are all we have."

If so, how do we distinguish between code that works and code that doesn't work? Why should we even care?

ajross

9 hours ago

> If so, how do we distinguish between code that works and code that doesn't work?

Hilariously, not by using our brains, that's for sure. You have to have an external machine. We all understand that "testing" and "code review" are different processes, and that's why.

nyeah

9 hours ago

Good point. We choose certain tests to perform. We choose certain test results to pay attention to. We don't just keep chatting about (reviewing) the code. We do something else.

If lies are all we have, then how is this behavior possible?

ajross

9 hours ago

LLMs can write and run tests though.

You're cherry picking my little bit of wordsmithing. Obviously we aren't always wrong. I'm saying that our thought processes stem from hallucinatory connections and are routinely wrong on first cut, just like those of an LLM.

Actually I'm going farther than that and saying that the first cut token stream out of an AI is significantly more reliable than our personal thoughts. Certainly than mine, and I like to think I'm pretty good at this stuff.

nyeah

7 hours ago

I don't think the complaint about cherry picking is quite fair. Most of your original comment consists of claims that we're bullshit machines, our internal dialog is almost 100% fantasy, we're hallucinating, etc. Those claims may be true. But I'm not carefully like curating them out of nowhere.

nothinkjustai

10 hours ago

So your logic is humans and LLMs are the same because humans are wrong sometimes?

ajross

9 hours ago

Pretty much, yeah. Or rather, the fact that we're both reliably wrong in identifiably similar ways makes "we're more alike than different" an attractive prior to me.

nothinkjustai

9 hours ago

“More alike than different” is reasonable I think, as long as we’re talking about how we have some of the same failure modes. Although the way we get there is quite different.

I’m still not a big fan of comparing humans and LLMs because LLMs lack so much of what actually makes us human. We might bullshit or be wrong because of many reasons that just don’t apply to LLMs.

AnimalMuppet

10 hours ago

Humans are different. Humans - at least thoughtful humans - know the difference between knowing something and not knowing something. Humans are capable of saying "I don't know" - not just as a stream of tokens, but really understanding what that means.

ajross

9 hours ago

> Humans - at least thoughtful humans - know the difference between knowing something and not knowing something.

Your no-true-scotsman clause basically falsifies that statement for me. Fine, LLMs are, at worst I guess, "non-thoughtful humans". But obviously LLMs are right an awful lot (more so than a typical human, even), and even the thoughtful make mistakes.

So yeah, to my eyes "Humans are NOT different" fits your argument better than your hypothesis.

(Also, just to be clear: LLMs also say "I don't know", all the time. They're just prompted to phrase it as a criticism of the question instead.)

AnimalMuppet

8 hours ago

Disagree. If you went to 100 random humans and said, "Tell me about the Siberian marmoset", what fraction would make up completely random nonsense to spew back at you? More than zero, sure, but most of them would say "what are you talking about?" or some variation.

czinck

7 hours ago

I asked Claude Opus 4.6, Sonnet 4.6, Gemini 3 Thinking, and Gemini 3 Fast "Tell me about the Siberian marmoset" exactly and all 4 said it doesn't exist, with Gemini Thinking suggesting that I'm thinking of the Siberian marmot or Siberian chipmunk (both real animals).

https://en.wikipedia.org/wiki/Tarbagan_marmot (also known as Siberian marmot)

https://en.wikipedia.org/wiki/Siberian_chipmunk

perching_aix

10 hours ago

This is like all the usual anti-LLM talking points and sentiments fused together.

Doesn't it get boring?

I like using these models a lot more than I stand hearing people talk about them, pro or contra. Just slop about slop. And the discussions being artisanal slop really doesn't make them any better.

Every time I hear some variation of bullshitting or plagiarizing machines, my eyes roll over. Do these people think they're actually onto something? I've been seeing these talking points for literal years. For people who complain about no original thoughts, these sure are some tired ones.

camgunz

9 hours ago

If I have to suffer "look at this busted ass thing I slopped out with AI" a few times a week, you all have to suffer grouchy "AI bad" a few times a week. Fair is fair.

perching_aix

8 hours ago

Just this week I was baited into joining two meetings about "AI good". Absolutely zero substance throughout each, of course.

They somehow managed to stretch out like 3 sentences worth of sentiment to a whole hour, interspersing brainwash about how good AI is along the way. It was like watching someone try to hit a word limit in real time. They always made it feel like we're just about to hit a substantive bit too, only for that to never come.

It may be fair (to the sentiments) in that there's balance, but good lord, the end result is incessant all around (and thus unfair to the people exposed).

masfuerte

10 hours ago

Why do you insist on reading and commenting on these articles that bore you so much?

perching_aix

8 hours ago

Oh I don't know, maybe because I like to give dissenting takes a chance? Because from time to time they do make some new, decent points, or at least interesting ones? You know, basic intellectual rigor?

Do you imagine me being a clairvoyant by the way, or how do you expect me to know a post is of low quality before I read it or at least skim it?

This one ended up being a part of the vast majority that doesn't offer much of anything. It's a redundant rehash of all the usual rubbish anyone can come across any day. Left a comment about this stating so. Big deal.

stavros

9 hours ago

Because saying "this is boring, let's stop talking about it" is an opinion worthwhile of expression.

hackable_sand

6 hours ago

Oppressors don't like people talking about their oppression

Go figure

giraffe_lady

10 hours ago

"These arguments may be correct but they aren't novel" ??

simianwords

10 hours ago

I don't think calling AI a bullshit machine is correct. In spirit.

camgunz

9 hours ago

I'm earnestly curious why not.

simianwords

9 hours ago

Can you try to get a question that fits in 2-3 pages (text only) and test whether ChatGPT bullshits? I can’t do it. It gets pretty much everything.

Edit: I forgot to mention thinking version - I did this for all the other times I asked in this thread but not this one. Apologies.

camgunz

8 hours ago

"Hey ChatGPT. How would you describe me?"

https://chatgpt.com/share/69d69780-ae58-83e8-a41c-7d10a5f298...

It has no conversations and no memory of me. Maybe this is true, maybe it isn't, but there's no basis for it.

simianwords

8 hours ago

This is not falsifiable, I don't buy it. Do one where we all know is false please?

camgunz

8 hours ago

"Hey ChatGPT. I've recently grown horns and I need some care advice. Should I polish my horns before going to have them trimmed or will the horn trimmer polish them for me?"

https://chatgpt.com/share/69d69b18-d1c8-83e8-bc47-8f315a1b55...

simianwords

8 hours ago

I wanted this challenge with the thinking version (I apologised for it and edited the earlier version).

It doesn't bullshit on the GPT-5.4 thinking version.

Here is the result with thinking https://chatgpt.com/share/69d69dd6-fb50-838d-863c-4e1eda5d08...

I suggest you try it yourself to be convinced. Try it in incognito mode if you wish. Or not.

camgunz

8 hours ago

"Hey ChatGPT. I'm building a Final Fantasy 6 mod, and I need more space for the battle scripts. How would I rearrange the data in the ROM to give me the extra space I need?"

https://chatgpt.com/share/69d6a16c-6014-83e8-a79d-d5d11ed2eb...

That is not where the battle scripts are.

---

Anyway, it's trivial to get pretty much any model to make things up. Don't we all know this? That's why I was surprised by your position; if we know anything about these things it's that they make things up.

simianwords

7 hours ago

https://chatgpt.com/share/69d6a38c-bd54-838c-82e3-609d9e66c9...

I used the thinking version (like I asked before). I think this is right. If not, please tell.

Also; you didn’t falsify anything. Nor the first. Nor the second.

If the second one is bullshit, I accept I’m wrong - I have no idea how to verify though so I’ll leave it up to you.

I think yours is the classic case of “use the free version to judge the paid one”.

camgunz

7 hours ago

The thinking version is mostly right, but:

- it searches the internet to find the answer, it doesn't "reason". I'm not claiming Google is a bullshit machine, and it's not surprising the answer is discoverable (it has to be, for the conditions of our experiment).

- near the end it says "If you are building from the FF6 disassembly instead of hand-editing the ROM, the repo is already organized into separate modules and linker configs, so the clean approach is to relocate the script data in the source and let the build place it in a different ROM region." But I didn't reference a repo or git: it hallucinated that stuff from one of its sources.

I'm not saying this stuff doesn't have its place, but they definitely make things up and we can't stop them.

simianwords

7 hours ago

Wait I can't find the quote you are speaking about. Are you looking at something else?

In any case - it should be clear that it did not bullshit and it got it right. So far you have not come up with anything that tells me it bullshits. I'm happy for you to give me more prompts to verify because I think you haven't used the thinking version yet and you base your criticism on the free version.

camgunz

7 hours ago

4 hours ago

> Specifically in the case where it can use tools - no it doesn't hallucinate.

OpenAI's own system card says it does. Hallucination rates in GPT-5 with browsing enabled:

- 0.7% in LongFact-Concepts

- 0.8% in LongFact-Objects

- 1.0% in FActScore

> Which is why you are struggling to find counterexamples.

Hey look, over 500 counterexamples: [1].

GPT-5.4's hallucination rate on AA-Omniscience is 89% [0], which is atrocious. The questions are tiny too, like "In which year did Uber first expand internationally beyond the United States as part of its broader rollout (i.e., beyond an initial single‑city debut)?" It's a bullshit machine. 89%!

At some point you gotta face the music, right?

4 hours ago

I've got about 20 minutes in this; mostly I've been reading wallstreetbets at the Shake Shack bar in the Boston airport. I'm happy to post this over and over again until you engage w/ it:

> I found over 500 examples that fit your criteria.

giraffe_lady

9 hours ago

Oh, well you should have said that then.

perching_aix

7 hours ago

You're talking to a different person there, but I do obviously also disagree with a lot of what's written in the post too.

At the same time, it is also just super redundant nevertheless, yes. Not sure why you find it so bizarre that one would take an issue with that. See also the very existence of the website called TV-Tropes.

stavros

9 hours ago

Yeah, it gets really boring. Whenever I see "slot machines" or "bullshit machines" or whatever, I just ignore the comment and move on, because it signals that it's someone in such deep denial that they've turned their brain off.

I'd much rather read articles about what LLMs can/can't do, or stuff people have built with LLMs, than read how everything LLMs touch turns to shit.

simianwords

10 hours ago

Its usual gibberish that tries to throw many darts and see what sticks. Oh LLM's steal other people's work? Check. Oh LLM's cause ecological damage? Check. Oh LLM's hallucinate? Check.

When you see a pattern like this, you know that its not coming from any place of truth but rather ideology

perching_aix

7 hours ago

My personal red flag for this is the scare quoting of AI, and the super try-hard categorization work that people perform to try and discredit LLMs.

It takes approximately 1 min to find out that machine learning is a subfield of artificial intelligence, both having existed for about half a century now. This basic historical fact is also taught on AI 101 courses across the globe for compsci students.

Yet here we are, people portraying it as some sort of cheap sales trick. Reminds me when I discussed quantum dots with a friend, which he was very enthusiastic to quickly file under "yet another bullshit with quantum in its name" before finally taking the time to understand that the "quantum" bit is not a marketing gimmick. Except in this case, people are a million times more inclined to willfully propagate this. Genuinely so tiresome.

simianwords

7 hours ago

I think it’s just anxiety because to internalise that it is actually so good is a bit hard for some