fumeux_fume
4 days ago
I like that OpenAI is drawing a clear line on what "hallucinations" are, giving examples, and showing practical steps for addressing them. The post isn’t groundbreaking, but it helps set the tone for how we talk about hallucinations.
What bothers me about the hot takes is the claim that “all models do is hallucinate.” That collapses the distinction entirely. Yes, models are just predicting the next token—but that doesn’t mean all outputs are hallucinations. If that were true, it’d be pointless to even have the term, and it would ignore the fact that some models hallucinate much less than others because of scale, training, and fine-tuning.
That’s why a careful definition matters: not every generation is a hallucination, and having good definitions lets us talk about the real differences.
freehorse
4 days ago
> What bothers me about the hot takes is the claim that “all models do is hallucinate.” That collapses the distinction entirely
That is a problem for "Open"AI because they want to sell their products, and because they want to claim that LLMs will scale to superintelligence. Not for others.
"Bad" hallucinations come in different forms, and what the article describes is one of them. Not all of them come from complete uncertainty. There are also the cases where the LLM is hallucinating functions in a library, or they reverse cause and effect when summarising a complex article. Stuff like this still happen all the time, even with SOTA models. They do not happen because the model is bad with uncertainty, they have nothing to do with knowledge uncertainty. Esp stuff like producing statements that misinterpret causal relationships within text, imo, reveals exactly the limits of the architectural approach.
p_v_doom
2 days ago
The problem, IMO, is not so much that all models hallucinate. It's more that our entire reality, especially as expressed through the training data (text), is entirely constructed. The text itself makes no distinction between, say, the reality of Abraham Lincoln and that of Bilbo Baggins; we often talk about the latter as if he were just as real. Is Jesus real? Is Jesus god? Is it a hallucination to claim the one you don't agree with? We can't even agree amongst ourselves what is real and what is not.
What we perceive as "not hallucination" is merely a very big consensus supported by education, culture, and personal beliefs, and it varies quite a bit. And little in the existence of the model gives it the tools to make those distinctions. Quite the opposite.
catlifeonmars
4 days ago
So there are two angles to this:
- From the perspective of LLM research/engineering, saying all LLM generation is hallucination is not particularly useful. It’s meaningless for the problem space.
- From the perspective of AI research/engineering in general (not LLM specific) it can be useful to consider architectures that do not rely on hallucination in the second sense.
druskacik
3 days ago
I like this quote:
'Everything an LLM outputs is a hallucination. It's just that some of those hallucinations are true.'
swores
2 days ago
To me that seems as pointless as saying "everything a person sees is a hallucination, it's just that some of those hallucinations are true". Sure, technically whenever we see anything it's actually our brain interpreting how light bounces off stuff and combining that with our mental models of the world to produce an image in our mind of what we're looking at... but if we start calling everything we see a hallucination, there's no longer any purpose in having that word.
So instead of being that pedantic, we decided that "hallucination" only applies when what our brain thinks we see does not match reality, so now hallucination is actually a useful word. Equally with LLMs, when people talk about hallucinations, part of the definition is that the output be incorrect in some way. If you just go with your quote's way of thinking about it, then once again the word loses all purpose and we can scrap it, since it now means exactly the same thing as "all LLM output".
1718627440
2 days ago
> everything a person sees is a hallucination, it's just some of those hallucinations are true
Except it's not. People can have hallucinations that are true (dreams), but most perception isn't generated by your brain; it comes from the outside.
hodgehog11
4 days ago
Absolutely in agreement here. This same statement should also be applied to the words "know", "understand", and "conceptualize". "Generalize", "memorize" and "out-of-distribution" should also be cautiously considered when working with systems trained on incomprehensibly large datasets.
We need to establish proper definitions and models for these things before we can begin to argue about them. Otherwise we're just wasting time.
parentheses
a day ago
Yes. Maybe a better way to put it would be, "all models guess every time because they are stochastic in nature. However, we only want the answers with high confidence."
player1234
3 days ago
Correct, it is a useless term whose goal is to gaslight and anthropomorphise a system that predicts the next token.
vrighter
4 days ago
If you insist that they are different, then please find one logical, non-subjective way to distinguish between a hallucination and not-a-hallucination. Looking at the output and deciding "this is clearly wrong" does not count. No vibes.
esafak
4 days ago
> Looking at the output and deciding "this is clearly wrong" does not count.
You need the ground truth to be able to make that determination, so using your knowledge does count. If you press the model to answer even when it does not know, you get confabulation. What today's models lack is the ability to measure their own confidence, so that they know when to abstain.
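For illustration, here is a minimal sketch of that kind of abstention heuristic, assuming a hypothetical setup where per-token log-probabilities come back alongside the generated text (all names below are made up, not a real API). A low mean log-probability is only a rough proxy for confidence, not a measure of factual truth:

    # Hypothetical abstention heuristic: refuse to answer when the mean
    # per-token log-probability of the generation falls below a threshold.
    from typing import List, Optional

    def answer_or_abstain(answer: str,
                          token_logprobs: List[float],
                          threshold: float = -1.0) -> Optional[str]:
        """Return the answer only if its mean token log-probability clears the threshold."""
        if not token_logprobs:
            return None  # nothing to judge confidence on, so abstain
        mean_logprob = sum(token_logprobs) / len(token_logprobs)
        if mean_logprob < threshold:
            return None  # abstain instead of confabulating
        return answer

    # A confidently generated answer passes; a shaky one is withheld.
    print(answer_or_abstain("Paris", [-0.1, -0.2]))          # Paris
    print(answer_or_abstain("Bilbo Baggins", [-2.5, -3.0]))  # None

Of course, token-level probability only says how likely the wording was given the input, not whether the claim is true, which is exactly the gap the replies below point at.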
player1234
3 days ago
There is no such thing as confidence regarding the actual facts, only confidence in the probable output given the input. Factual confidence is impossible with the current architecture.
1718627440
2 days ago
And having no ground truth is what defines a hallucination.
vrighter
3 days ago
So... vibes. Got it. In most cases there is no ground truth to compare against, because the facts are not in the training data, where you could at least make objective, quantifiable measurements on the statistics.
ttctciyf
4 days ago
"Hallucination" is a euphemism at best, and the implication it carries that LLMs correctly perceive (meaning) when they are not hallucinating is fallacious and disinforming.
The reification of counterfactual outputs, which are otherwise etiologically indistinguishable from the rest of LLM production, is a better candidate for the label "hallucination", IMO.