dragonwriter
a day ago
> My former colleague Rebecca Parsons, has been saying for a long time that hallucinations aren’t a bug of LLMs, they are a feature. Indeed they are the feature. All an LLM does is produce hallucinations, it’s just that we find some of them useful.
This is an example of my least favorite style of feigned insight: redefining a term into meaninglessness just so you can say something that sounds different while not actually saying anything new.
Yes, if you redefine "hallucination" from "produce output containing detailed information despite that information not being grounded in external reality, in a manner distantly analogous to a human reporting sense data produced by a literal hallucination rather than the external inputs that are presumed normally to ground sense data" to "produce output", it's true that all LLMs do is "hallucinate", and that "hallucinating" is not an undesirable behavior.
But you haven't said anything new about the thing that was called "hallucination" by everyone else, or about the thing--LLM output in general--that you have called "hallucination". Everyone already knew that producing output wasn't undesirable. You've just taken the label conventionally attached to a bad behavior, attached it to a broader category that includes all behavior, and used the power of equivocation to make something that sounds novel without saying anything new.
scott_w
a day ago
Fowler is not really redefining "hallucination." He's using a form of irony that emphasises how fundamental "hallucinations" are to the operation of the system. One might also say "you can't get rid of collateral damage from bombs. Indeed, collateral damage is the feature, it's just some of that is what we want to blow up."
You're not meant to take it literally.
xyzzy123
a day ago
You might as well say it's interpolating or extrapolating. That's what people are usually doing too, even when recalling situations that they were personally involved in.
I think we call it "hallucinating" when the machine does this in an un-human-like way.
dapperdrake
a day ago
The longer term for this is "stochastic parrot". See another HN comment here comparing LLMs to theater actors or movie actors.
LLMs just spew words. It just so happens that human beings can decode them into something related, useful, and meaningful surprisingly often.
Might even be a useful case of pareidolia (a term I dislike, because a world without any pattern matching whatsoever would not necessarily be "better").
jychang
a day ago
I dislike the term “stochastic parrot”, because there’s plenty of evidence that LLMs do have an understanding of at least some things that they are saying.
We can trace which neurons activate in a face-recognition model and see that a certain neuron does light up when it sees a face. Likewise, the correct features are active for the sentence “the word parrots is plural”.
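For what it's worth, the kind of tracing described here can be illustrated with a toy, hand-wired detector (the weights below are invented purely for illustration; real interpretability work probes learned weights, but the "this unit lights up for this pattern" idea is the same):

```python
import numpy as np

# Two hand-wired hidden units: unit 0 responds to a diagonal 2x2
# pattern, unit 1 to an anti-diagonal one.
W = np.array([
    [ 1.0, -1.0, -1.0,  1.0],  # unit 0: "diagonal detector"
    [-1.0,  1.0,  1.0, -1.0],  # unit 1: "anti-diagonal detector"
])

def activations(patch):
    """ReLU activations of both units for a flattened 2x2 patch."""
    return np.maximum(W @ patch.ravel(), 0.0)

diagonal = np.array([[1.0, 0.0], [0.0, 1.0]])
anti     = np.array([[0.0, 1.0], [1.0, 0.0]])

print(activations(diagonal))  # unit 0 lights up: [2. 0.]
print(activations(anti))      # unit 1 lights up: [0. 2.]
```

Whether such selective firing amounts to "understanding" is exactly what the rest of this thread argues about; the tracing itself only shows selectivity.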
If you stop assuming the LLMs have no internal representations of the data, then everything makes a lot more sense! The LLM is FORCED to answer questions… just like a high school student filling out the SAT is forced to answer questions.
If a high school student fills out the wrong answer on the SAT, is that a hallucination?
Hallucinations are expected behaviors if you RLHF a high schooler to always guess an answer on the SAT because that’ll get them the highest score. This applies to ML model reward functions as well.
ACCount37
a day ago
This matches what we know about LLMs and hallucination-avoidance behavior in LLMs.
"Wrong answers on SAT" is also the leading hypothesis on why o3 was such an outlier - far more prone to hallucinations than either prior or following OpenAI models.
On the SAT, a random answer is right 20% of the time, more if you can rule out at least one obviously wrong option. Saying "I don't know" and not answering is right 0% of the time. So if you run RLVR on SAT-type tests, where any answer is better than no answer, you encourage hallucinations. Hallucination avoidance in LLMs is a fragile capability, and OpenAI probably fried o3 with too much careless RLVR.
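The arithmetic above can be checked with a quick simulation (purely illustrative: a 5-choice test scoring 1 point per correct answer, not OpenAI's actual reward setup):

```python
import random

def expected_reward(policy, n_choices=5, trials=100_000):
    """Estimate expected score on multiple-choice questions where a
    correct answer earns 1 point and anything else earns 0."""
    total = 0
    for _ in range(trials):
        correct = random.randrange(n_choices)
        if policy(n_choices) == correct:
            total += 1
    return total / trials

guess = lambda n: random.randrange(n)   # always pick some option
abstain = lambda n: None                # say "I don't know"

print(expected_reward(guess))    # ~0.20 on a 5-choice test
print(expected_reward(abstain))  # 0.0: abstaining is never rewarded
```

Any reward scheme that scores a wrong guess the same as an abstention pushes the policy toward always guessing; penalizing wrong answers (as the old SAT's -1/4 point did) changes that calculus.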
But another cause of hallucinations is limited self-awareness of modern LLMs. And I mean "self-awareness" in a very mechanical, no-nonsense fashion: "has information about itself and its own capabilities". LLMs have very little of that.
Humans have some awareness of the limits of their knowledge - not at all perfect, but at least there's something. LLMs get much, much less of that. LLMs learn the bulk of their knowledge from pre-training data, but pre-training doesn't teach them a lot about where the limits of their knowledge lie.
neonspark
a day ago
> But another cause of hallucinations is limited self-awareness of modern LLMs… Humans have some awareness of the limits of their knowledge
Until you said that, I didn’t realize just how much humans “hallucinate” in just the same ways that AI does. I have a friend who is fluent in Spanish, a native speaker, but got a pretty weak grammar education when he was in high school. He also got no formal education at all in critical thinking. So this guy is really, really fluent in his native language, but can often have a very difficult time explaining why he uses whatever grammar he uses. I think the whole world is realizing how little our brains can correctly explain and identify the grammar we use flawlessly.
He helps me to improve my Spanish a lot, he can correct me with 100% accuracy of course, but I’ve noticed on many occasions, including this week, that when I ask a question about why he said something one way or another in Spanish, he will just make up some grammar rule that doesn’t actually exist, and is in fact not true.
He said something like “you say it this way when you really know the person and you’re saying that the other way when it’s more formal”, but I think really it was just a slangy way to mis-stress something and it didn’t have to do with familiar/formal or not. I’ve learned not to challenge him on any of these grammar rules that he makes up, because he will dig his heels in, and I’ve learned just to ignore him because he won’t remember this made-up grammar rule in a week anyway.
This really feels like a very tight analogy with what my LLM does to me every day, except that when I challenge the LLM it will profusely apologize and declare itself incorrect even if it had been correct after all. Maybe LLMs are a little bit too humble.
I imagine this is a very natural tendency in humans, and I imagine I do it much more than I’m aware of. So how do humans use self-awareness to reduce the odds of this happening?
I think we mostly get trained in higher education to not trust the first thought that comes into our head, even if it feels self consistent and correct. We eventually learn to say “I don’t know” even if it’s about something that we are very, very good at.
thrawa8387336
a day ago
Spanish in particular has more connotations per word than English. It's not even the grammar or spelling; those have rules and that's that. But choosing appropriate words is more like every word has its right place, time, and context. Some close examples would be the N-word or the R-word in English, as they are steeped in meanings far beyond the literal.
skydhash
a day ago
> He said something like “you say it this way when you really know the person and you’re saying that the other way when it’s more formal”, but I think really it was just a slangy way to mis-stress something and it didn’t have to do with familiar/formal or not.
There’s such a thing in Spanish and in French. Formal and informal settings are reflected in the language. French even distinguishes three different levels of vocabulary: one for very informal settings (close friends), one for business and daily interactions, and one for very formal settings. It’s all cultural.
_heimdall
a day ago
> We can trace which neurons activate for a face recognition model and see that a certain neuron does light up when it sees a face.
Seeing which parts of a model (they aren't neurons) light up when shown a face doesn't necessarily indicate understanding.
The model is a complex web of numbers representing a massively compressed data space. It could easily be that what you see light up when shown a face only indicates which part of the model houses the compressed data related to recognizing specific facial features.
neonspark
a day ago
> Seeing which parts of a model (they aren't neurons)…
I thought models were composed of neural network layers, among other things. Are these data structures called something different?
_heimdall
20 hours ago
That point may not have been relevant for me to include.
I was getting at the idea that a neuron is a very specific feature of a biological brain, regardless of what AI researchers may call their hardware they aren't made of neurons.
jychang
17 hours ago
1. They are neurons, whether you like it or not. A binary tree may not have squirrels living in it, but it's still a tree, even though the word "tree" here is defined differently than in biology. Or are you going to say a binary tree is not a tree?
2. You are about 5 years behind in terms of the research. Look into hierarchical feature representation and how MLP neurons work. (Or even in older CNNs and RNNs etc). And I'm willingly using the word "neuron" instead of "feature" here because while I know "feature" is more correct in general, there are definitely small toy models where you can pinpoint an individual neuron to represent a feature such as a face.
_heimdall
16 hours ago
What were you getting at with the MLP example? MLPs do a great job with perception abilities, and I get that they use the term neuron frequently. I disagree with the use of the name there, that's all; similarly, I disagree that LLMs are AI, but here we are.
Using the term neuron there and meaning it literally is like calling an airplane a bird. I get that the colloquial use exists, but no one thinks they are literal birds.
jychang
15 hours ago
Do you also disagree with the use of the name “tree” in a computer science class?
Again, nobody thinks trees in computer science contain squirrels, nobody thinks airplanes are birds, and nobody thinks a neuron in an ML model contains axons and dendrites. This is a weird hill to die on.
Are you gonna complain that the word “photograph” means “light writing” but in reality nobody is writing anything, so therefore the word is wrong?
_heimdall
14 hours ago
I would disagree with anyone that wants to say they are the same as a natural tree, sure.
I don't believe the term photograph was repurposed when cameras were invented, that example doesn't fit.
More importantly, I argued that neuron has a very specific biological meaning, and it's a misuse to apply the term to what is ultimately running on silicon.
Your claim was that they are neurons, period. You didn't expand on that further which reads as a pretty literal use of the term to me. We're online discussing in text, that reading of your comment could be completely wrong, that's fine. But I stand by my point that what is inside an LLM or a GPU is not a neuron.
the_af
a day ago
I think this could be seen as a proxy for evidence that there's some degree of reasoning, if we think we can identify specialized features that always become involved in certain kinds of outputs. It's not proof, but it's not nothing either. It has some parallels with how research on human brains is conducted, right?
_heimdall
20 hours ago
It does have parallels to the human brain, absolutely. We've been studying the human brain in similar ways for much longer though and we still don't know much about it.
We do know what areas of a human brain often light up in response to various conditions. We don't know why that is though, or how it actually works. Maybe more importantly for LLMs, we don't know how human memory works, where it is stored, how to recognize or even define consciousness, etc.
Seeing what areas of a brain or an LLM light up can be interesting, but I'd be very cautious trying to read much into it.
sillyfluke
a day ago
>I dislike the term “stochastic parrot”, because there’s plenty of evidence that LLMs do have an understanding of at least some things that they are saying.
It's bold to use the term "understanding" in this context. You ask it something about a topic, and it answers like someone who understands the topic. You change the prompt slightly, in a way where a human who understood the topic would still trivially give the right response, and the LLM outputs an answer that is wrong or irrelevant, and wrong in an unpredictable, non-human way: no human who showed understanding in the first answer could plausibly answer the second question in such a bizarre manner.
The fact that the LLM can be shown to have some sort of internal representation does not necessarily mean that we should call this "understanding" in any practical sense when discussing these matters. I think it's counterproductive in getting to the heart of the matter.
naasking
a day ago
> You change the prompt slightly, in a way where a human who understood the topic would still trivially give the right response, and the LLM outputs an answer that is wrong or irrelevant, and wrong in an unpredictable, non-human way: no human who showed understanding in the first answer could plausibly answer the second question in such a bizarre manner.
I think this should make you question whether the prompt change was really as trivial as you imply. Providing an example of this would help clarify.
roadside_picnic
21 hours ago
Here's an entire paper [0] showing the impact of extremely minor structural changes on the quality of the results of the model. Things as simple as not using a colon in the prompt can lead to notably degraded (or improved) performance.
aatd86
a day ago
But... but... humans do the same thing? This is input/output, with the output formed by applying the function of our life experiences (training) to the input. They just don't have the hormonal circuitry to optimise for whatever our bodies are trying to optimise for when we make decisions.
cmiles74
a day ago
People actually understand the input and the output. An LLM understands neither; it's generating output that is statistically likely, within some bounds. As Fowler said, it's a pleasant coincidence that some of this output has value to us.
(For sure, arguments can be made that the relationships between the terms that the model has encoded could maybe be called "model thinking" or "model understanding", but it's not how people work.)
seunosewa
a day ago
We do the same thing. We pick words that are statistically likely to get us what we want. And much of it is unconscious. You don't formally reason about every word you speak. You are focused on your objective, and your brain fills in the gaps.
scott_w
18 hours ago
We absolutely do not "pick words that are statistically likely to get us what we want." We use words to try to articulate (to varying levels of success) a message that we want to communicate. The words, tone, speed, pitch, etc. all convey meaning.
> And much of it is unconscious.
That does not mean we're "picking words statistically likely to get us what we want," it means "our brains do a lot of work subconsciously." Nothing more.
> You are focused on your objective, and your brain fills in the gaps.
This is a total contradiction of what you said at the start. LLMs are not focused on an objective, they are using very complex statistical algorithms to determine output strings. There is no objective to an LLM's output.
seunosewa
6 hours ago
The LLM objective is whatever they are trained to do, whether it's completing text, obeying instructions, coding, etc.
In pre-training, we drop a lot of human-written text in them. This allows them to learn the rules of language and grammar and common language patterns. At this stage, the objective is to predict the next token that makes sense to human beings.
Examples: “The capital of the US is ...” “Why did the chicken ...”
The next step is instruct training, where they are trained to follow instructions. At this point, they are predicting the next token that will satisfy the user's instructions. They are rewarded for following instructions.
Next step, they are trained to reason by feeding them with reasoning examples to get them going, and then rewarding them whenever their reasoning leads them to good answers. They learn to predict the next reasoning token that will lead them to the best answers.
The objective is imparted by their training. They are "rewarded" when their output satisfies the objective, so that as they are trained, they get better and better at achieving the objectives of the training.
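The pre-training stage described above can be sketched as a single cross-entropy step (a toy three-token vocabulary and made-up logits, purely for illustration, not any lab's actual training code):

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def next_token_loss(logits, target_id):
    """Cross-entropy for one step: loss is low only when the model
    puts probability mass on the token that actually came next."""
    return -np.log(softmax(logits)[target_id])

vocab = ["Paris", "London", "banana"]
logits = np.array([3.0, 1.0, -2.0])  # model strongly favors "Paris"

# If the training text continued with "Paris", the loss is small;
# if it continued with "banana", the loss is large, and gradient
# descent would reshape the logits toward the data.
print(next_token_loss(logits, vocab.index("Paris")))   # ~0.13
print(next_token_loss(logits, vocab.index("banana")))  # ~5.13
```

The later stages (instruct tuning, reasoning RL) keep the same next-token machinery and only change what gets rewarded.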
aatd86
15 hours ago
There is an objective: solving an optimization problem. Or, seen another way, given a matrix of predicates, it tries to compute some final value that is as close to 1 as possible by applying that matrix to your input prompt. This is more or less what it does.
aatd86
15 hours ago
Interesting that you say that. What actually is 'understanding'? A semantic mapping between tokens (words) and objects, and relations between those objects? How would you define it?
naasking
a day ago
> LLMs just spew words. It just so happens that human beings can decode them into something related, useful, and meaningful surprisingly often.
This sentence is inherently contradictory. If LLM output is meaningful more than chance, then it's literally not "just spewing words". Therefore whatever model it is using to generate that meaning must contain some semantic content, even if it's not semantic content that's as rich as humans are capable of. The "stochastic parrot" term is thus silly.
thrawa8387336
a day ago
It's a sufficiently large-N Shannonizer. Nothing more.
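A Shannonizer in Claude Shannon's original sense is easy to sketch: tally which character follows each n-character context in a corpus, then sample from those tallies (the corpus and order below are toy choices for illustration):

```python
import random
from collections import defaultdict

def train_shannonizer(text, n=3):
    """Record which character follows each n-character context."""
    table = defaultdict(list)
    for i in range(len(text) - n):
        table[text[i:i + n]].append(text[i + n])
    return table

def generate(table, seed, length=60):
    """Sample continuations from observed frequencies. No meaning is
    involved anywhere: only counts of what followed what."""
    n = len(seed)
    out = seed
    for _ in range(length):
        choices = table.get(out[-n:])
        if not choices:
            break
        out += random.choice(choices)
    return out

corpus = "the cat sat on the mat and the man ran to the cat on the mat "
table = train_shannonizer(corpus)
print(generate(table, "the"))
```

Scale the corpus and N up far enough and the output starts to look fluent; the disagreement in this thread is over whether anything qualitatively different happens along the way.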
jonnycomputer
a day ago
so are we, maybe.
naasking
a day ago
Definitely true at some level.
thwarted
a day ago
It's meant to contrast/correct the claim that
"LLMs produce either truth or they produce hallucinations."
Claims worded like that give the impression that if we can just reduce/eliminate the hallucinations, all that will remain will be the truth.
But that's not the case. What is the case is that all the output is the same thing, hallucination, and some of those hallucinations just so happen to reflect reality (or expectations) and so appear to embody truth.
It's like rolling a die and wanting one to come up, and when it does saying "the die knew what I wanted".
An infinite number of monkeys typing on an infinite number of typewriters will eventually produce the script for Hamlet.
That doesn't mean the monkeys know about Shakespeare, or Hamlet, or even words, or even what they are doing being chained to typewriters.
We've found a way to optimize the infinite typing monkeys to output something that passes for Hamlet much sooner than infinity.
sceptic123
a day ago
I actually found that comment interesting. It's pointing towards something I've struggled with around LLMs. They are (currently) incapable of knowing if what they output is correct, so the idea that "it's all hallucinations" acknowledges that point and gives useful context for anyone using LLMs for software development.
Eridrus
a day ago
Humans are also incapable of knowing whether their output is correct. We merely convince ourselves that it is and then put our thoughts in contact with the external world and other people to see if we actually are.
jimbokun
a day ago
So then we are capable of knowing whether our output is correct, by putting it into contact with the external world.
dghlsakjg
a day ago
Humans are capable of using formal logic to reach a provably correct conclusion from valid premises. That conclusion can then be used as a premise for further conclusions. Following the logic the other way, using our powers of observation and sensing, we can gather base-level truths and know that what we are saying is correct in an absolute sense.
Computers use formal logic all the time to output truth, LLMs, however, do not.
bartread
a day ago
I’ve never liked that this behaviour is described using the term “hallucination”.
If a human being talked confidently about something that they were just making up out of thin air by synthesizing based (consciously or unconsciously) on other information they know you wouldn’t call it “hallucination”: you’d call it “bullshit”.
And, honestly, “bullshit” is a much more helpful way of thinking about this behaviour because it somewhat nullifies the arguments people make against the use of LLMs due to this behaviour. Fundamentally, if you don’t want to work with LLMs because they sometimes “bullshit”, are you planning on no longer working with human beings as well?
It doesn’t hold up.
But, more than that, going back to your point: it’s much harder to redefine the term “bullshit” to mean something different to the common understanding.
All of that said, I don’t mind the piece and, honestly, the “I haven’t the foggiest” comment about the future of software development as a career is well made. I guess it’s just a somewhat useful collection of scattered thoughts on LLMs and, as such, an example of a piece where the “thoughts on” title fits well. I don’t think the author is trying to be particularly authoritative.
dragonwriter
a day ago
> I’ve never liked that this behaviour is described using the term “hallucination”.
I have a standard canned rant about how "confabulation" is a much better metaphor, but that wasn't the point I was focussed on here.
> Fundamentally, if you don’t want to work with LLMs because they sometimes “bullshit”, are you planning on no longer working with human beings as well?
I will very much not voluntarily rely on a human for particular tasks if that human has demonstrated a pattern of bullshitting me when given that kind of task, yes, especially if, on top of the opportunity cost inherent in relying on a person for a particular task, I am also required to compensate them—e.g., financially—for their notional attention to the task.
scott_w
a day ago
> If a human being talked confidently about something that they were just making up out of thin air by synthesizing based (consciously or unconsciously) on other information they know you wouldn’t call it “hallucination”: you’d call it “bullshit”.
I'd recommend you watch https://www.youtube.com/watch?v=u9CE6a5t59Y&t=2134s&pp=ygUYc... which covers the topic of bullshit. I don't think we can call LLM output "bullshit" because someone spewing bullshit has to not care about whether what they're saying is true or false. LLMs don't "care" about anything because they're not human. It's better to give it an alternative term to differentiate it from the human behaviour, even if the observed output is recognisable.
socksy
a day ago
It's precisely because they can't care that they are by definition bullshit machines. See https://link.springer.com/article/10.1007/s10676-024-09775-5
scott_w
21 hours ago
I disagree with the article’s thesis completely. Humans are the ones that spread the bullshit, the LLM just outputs text. Humans are the necessary component to turn that text from “output” into “bullshit.” The machine can’t do it alone.
movpasd
a day ago
At the risk of sounding woo, I find some parallels in how LLMs work to my experiences with meditation and writing. My subjective experience of it is that there is some unconscious part of my brain that supplies a scattered stream of words as the sentence forms --- without knowing the neuroscience of it, I could speculate it is a "neurological transformer", some statistical model that has memorised a combination of the grammar and contextual semantic meaning of language.
The difference is that the LLM is _only that part_. In producing language as a human, I filter these words, I go back and think of new phrasings, I iterate --- in writing consciously, in speech unconsciously. So rather than a sequence it is a scattered tree filled with rhetorical dead ends, pruned through interaction with my world-model and other intellectual faculties. You can pull on one thread of words as though it were fully-formed already as a kind of Surrealist exercise (like a one-person cadavre exquis), and the result feels similar to an LLM with the temperature turned up too high.
But if nothing else, this highlights to me how easily the process of word generation may be decoupled from meaning. And it serves to explain another kind of common human experience, which feels terribly similar to the phenomenon of LLM hallucination: the "word vomit" of social anxiety. In this process it suddenly becomes less important that the words you produce are anchored to truth, and instead the language-system becomes tuned to produce any socially plausible output at all. That seems to me to be the most apt analogy.
jimbokun
a day ago
"Bullshit engine" is the term that best explains to a lay person what it is that LLMs do.
cess11
a day ago
The point, though awkwardly stated, is that there is no difference between 'hallucination' output and any other output from these compressed databases, just like there is no such difference between two queries on a typical RDBMS.
It's a good point.
dragonwriter
20 hours ago
> The point, though awkwardly stated, is that there is no difference between 'hallucination' output and any other output from these compressed databases,
But there is, except when you redefine "hallucination" so there isn't. And, when you retain the definition where there is a difference, you find there are techniques by which you can reduce hallucinations, which is important and useful. Changing the definition to eliminate the distinction is actively harmful to understanding and productive use of LLMs, for the benefit of making what superficially seems like an insightful comment.
belter
a day ago
> You've just taken the label conventionally attached to a bad behavior, attached it to a broader category that includes all behavior, and used the power of equivocation to make something that sounds novel without saying anything new.
You mean like every evangelist saying AI changes everything every 5 minutes... when in reality what they mean is that neural-net statistical code generators are getting pretty good? Because that is almost all the AI there is at the moment?
Just to make the current AI sound bigger than it is?
ljm
a day ago
If everything is a hallucination then nothing is a hallucination.
Thanks for listening to my Ted Talk.