
Isn't it true that the only thing that LLMs do is "hallucinate"?

The only way to know if it did “hallucinate” is to already know the correct answer. If you can make a system that knows when an answer is right or not, you no longer need the LLM!



Hallucination implies a failure of an otherwise sound mind. What current LLMs do is better described as bullshitting. As the bullshitting improves, it happens to be correct a greater and greater percentage of the time.


Sometimes when I am narrating a story I don't care that much about trivial details but focus on the connection between those details. Is there an LLM counterpart to such behaviour? In this case, one could say I was bullshitting about the trivial details.


Everyone does this, all the time, without even trying.

https://en.wikipedia.org/wiki/Memory#Construction_for_genera...


At what ratio of correctness:nonsense does it cease to be bullshitting? Or is there no tipping point so long as the source is a generative model?


It has nothing to do with the ratio and everything to do with intent. Bullshitting is what we say you are doing when you just spin a story with no care for the truth, making up stuff that sounds plausible. That is what LLMs do today, and what they will always do as long as we don't train them to care about the truth.

You can have a generative model that cares about the truth when it generates responses; it's just that current LLMs don't.


> You can have a generative model that cares about the truth when it generates responses; it's just that current LLMs don't.

How would you do that, when they don't have any concept of truth to start with (or any concepts at all)?


You can program a concept of truth into them, or maybe punish them for making mistakes instead of just rewarding them for replicating text. Nobody knows how to do that in a way that gets intelligent results today, but we know how to code things that output or check truths in other contexts; Wolfram Alpha, for example, is capable of solving tons of things and isn't wrong.

> (or any concepts at all).

Nobody here said that; that is your interpretation. Not everyone who is skeptical of current LLM architectures' future potential as AGI thinks that computers are unable to solve these things. Most here who argue against LLMs don't think the problems are unsolvable, just not solvable by the current style of LLMs.


> You can program a concept of truth into them, ...

The question was, how do you do that?

> Nobody here said that, that is your interpretation.

What is my interpretation?

I don't think that the problems are unsolvable, but we don't know how to do it now. Thinking that you can "just program the truth into them" shows a lack of understanding of the magnitude of the problem.

Personally I'm convinced that we'll never reach any kind of AGI with LLMs. They lack any kind of model of the world that can be used for reasoning, and any concept of reasoning at all.


> The question was, how do you do that?

And I answered: we don't know how to do that, which is why we don't do it currently.

> Personally I'm convinced that we'll never reach any kind of AGI with LLMs. They lack any kind of model of the world that can be used for reasoning, and any concept of reasoning at all.

Well, for some definition of LLM we probably could, though probably not the way they are architected today. There is nothing stopping a large language model from adding different things to its training steps to enable new kinds of reasoning.

> What is my interpretation?

Well, I read your post as being on the other side. I believe it is possible to make a model that can reason about truthfulness, but I don't think current-style LLMs will lead there. I don't know exactly what will take us there, but I wouldn't rule out an alternate way to train LLMs that looks more like how we teach students in school.


Try keywords like "epistemology" in the prompt. ChatGPT generally outperforms humans in epistemology substantially in my experience, and it seems to "understand" the concept much more clearly and deeply, and without aversion (perhaps for lack of an ego or sense of self, values, goals, desires, etc.).


> It has nothing to do with the ratio and everything to do with intent. Bullshitting is what we say you are doing when you just spin a story with no care for the truth, making up stuff that sounds plausible

Do you people hear yourselves? You're discussing the state of mind of a pseudo-RNG...


An ML model's intent is its reward function. It strives to maximize reward, just as a human does. There is nothing strange about this.

Humans are much more complex than these models, so they have many more concepts, which is why we need psychology. But some core aspects work the same in ML and in human thinking. In those cases it is helpful to use the same terminology for humans and machine learning models, because that helps transfer understanding from one domain to the other.


I had this perfect mosquito repellent: all you had to do was catch the mosquito and spray the solution into its eyes, blinding it immediately.


Does every thread about this topic have to have someone quibbling about the word “hallucination”, which is already an established term of art with a well understood meaning? It’s getting exhausting.


The term hallucination is a fundamental misunderstanding of how LLMs work, and continuing to use it will ultimately result in a confused picture of what AI and AGI are and what is "actually happening" under the hood.

Wanting to use accurate language isn't exhausting, it's a requirement if you want to think about and discuss problems clearly.


Arguing about semantics rarely keeps topics on track, e.g., my reply to your comment.


"Arguing about semantics" implies that there is no real difference between calling something A vs. calling it B.

I don't think that's the case here: there is a very real difference between describing something with a model that implies one (false) thing vs. a model that doesn't have that flaw.

If you don't find that convincing, then consider this: by taking the time to properly define things at the beginning, you'll save yourself a ton of time down the line, since you won't need to untangle the mess that results from being sloppy with definitions at the start.

This is all a long way of saying that aiming to clarify your thoughts is not the same as arguing pointlessly over definitions.


"Computer" used to mean the job done by a human being. We chose to use the meaning to refer to machines that did similar tasks. Nobody quibbles about it any more.

Words can mean more than one thing. And sometimes the new meaning is significantly different but once everyone accepts it, there's no confusion.

You're arguing that we shouldn't accept the new meaning - not that "it doesn't mean that" (because that's not how language works).

I think it's fine - we'll get used to it and it's close enough as a metaphor to work.


I'd be willing to bet that people did quibble about what "computer" meant at the time the meaning was transitioning.

It feels like you're assuming that we're already 60 years past re-defining "hallucination" and the consensus is established, but the fact that people are quibbling about it right now is a sign that the definition is currently in transition and has not reached consensus.

What value is there in trying to shut down the consensus-seeking discussion that gave us "computer"? The same logic could be used to argue that "computers" should actually be called "calculators", so why are people still trying to call them "computers"?


Does every completely legitimate condemnation of erroneous language have to be whined about by some apologist for linguistic erosion?


You stole a term which means something else in an established domain and now assert that the ship has sailed, whereas a perfectly valid term exists in both domains. Don't be a lazy smartass.

https://en.wikipedia.org/wiki/Confabulation


That's actually what the paper is about. I don't know why they didn't use that in the title.

> Here we develop new methods grounded in statistics, proposing entropy-based uncertainty estimators for LLMs to detect a subset of hallucinations—confabulations—which are arbitrary and incorrect generations.


This is exactly how language works; words are adopted across domains and change meaning over time.


If there's any forum which can influence a more correct name for a concept, it's this one, so please excuse me while I point out that contemporary LLMs confabulate, and that "hallucinating" should be reserved for more capable models.


It is exhausting, but so is the misconception that the output of an LLM can be cleanly divided into two categories.


If the meaning was established and well understood, this wouldn't happen in every thread.


It’s well understood in the field. It’s not well understood by laymen. This is not a problem that people working in the field need to address in their literature.


We're mostly laymen here.


The paper itself talks about this, so yes?


All people do is confabulate too.

Sometimes it is coherent (grounded in physical and social dynamics) and sometimes it is not.

We need systems that try to be coherent, not systems that try to be unequivocally right, which wouldn't be possible.


> We need systems that try to be coherent, not systems that try to be unequivocally right, which wouldn't be possible.

The fact that it isn't possible to be right about 100% of things doesn't mean that you shouldn't try to be right.

Humans generally try to be right; these models don't, and that is a massive difference you can't ignore. The fact that humans often fail to be right doesn't mean that these models shouldn't even try to be right.


By their nature, the models don’t ‘try’ to do anything at all—they’re just weights applied during inference, and the semantic features that are most prevalent in the training set will be most likely to be asserted as truth.


They are trained to predict the next word in text similar to what they have seen; I call that what they "try" to do here. A chess AI tries to win since that is what it was encouraged to do during training, and current LLMs try to predict the next word since that is what they are trained to do. There is nothing wrong with using that word.

This is an accurate usage of "try": ML models at their core try to maximize a score, so what that score represents is what they try to do. And there is no concept of truth in LLM training, just sequences of words; they have no score for true or false.

Edit: Humans are punished as kids for being wrong, all throughout school and in most homes; that makes humans try to be right. That is very different from these models, which are just rewarded for mimicking regardless of whether they are right or wrong.
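To make that concrete, here is a minimal sketch of the usual next-token objective (assuming a PyTorch-style model that returns logits over the vocabulary; the function name and shapes are mine, not taken from any particular codebase). The loss only rewards reproducing the observed text; nothing in it scores true vs. false:

    import torch.nn.functional as F

    def next_token_loss(model, token_ids):
        # token_ids: (batch, seq_len) tensor of token ids from the training text
        logits = model(token_ids[:, :-1])   # predicted distribution for each position
        targets = token_ids[:, 1:]          # the tokens that actually came next
        # Cross-entropy rewards matching the observed text, whether or not
        # that text happens to be true.
        return F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            targets.reshape(-1),
        )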


> That is very different from these models, which are just rewarded for mimicking regardless of whether they are right or wrong

That's not a totally accurate characterization. The base models are just trained to predict plausible text, but then the models are fine-tuned on instruct or chat training data that encourages a certain "attitude" and correctness. It's far from perfect, but an attempt is certainly made to train them to be right.


They are trained to replicate text semantically and are then given a lot of correct statements to replicate; that is very different from being trained to be correct. It makes them more useful and less incorrect, but they still don't have a concept of correctness trained into them.


Exactly. If massive data poisoning were to happen, would the AI be able to know what the truth is if there were as much new false information as real information? It wouldn't be able to reason about it.


> Humans generally try to be right,

I think this assumption is wrong, and it's making it difficult for people to tackle this problem, because people do not, in general, produce writing with the goal of producing truthful statements. They try to score rhetorical points, they try to _appear smart_, they sometimes intentionally lie because it benefits them, and so on. Almost all human writing is full of falsehoods, ranging from unintentional misstatements of fact to out-and-out deceptions. Forget the politically-fraught topic of journalism and just look at the writing produced in the course of doing business: everything from PR statements down to Jira tickets is full of bullshit.

Any system that is capable of finding "hallucinations" or "confabulations" in AI-generated text in general should also be capable of finding them in human-produced text, which is probably an unsolvable problem.

I do think that, since the models do have some internal representation of certitude about facts, the smaller problem of finding potentially incorrect statements in their own produced text, based on what they know about the world, _is_ possible, though.


It is an unsolved problem for humans.


If you'd read the article you might have noticed that generating answers with the LLM is very much part of the fact-checking process.


The answer is no, otherwise this paper couldn't exist. Just because you can't draw a hard category boundary doesn't mean "hallucination" isn't a coherent concept.


(The OP is referring to one of the foundational concepts relating to the entropy of a model of a distribution of things. It's not the terminology I would use, but the "you have to know everything, and then the model wouldn't really be useful" objection is why I didn't end up reading the paper after skimming a bit to see if they addressed it.

It's why things in this arena are a hard problem. It's extremely difficult to actually know the entropy of certain meanings of words, phrases, etc., without a comical amount of computation.

This is also why a lot of the interpretability methods people use these days have some difficult and effectively permanent challenges inherent to them. Not that they're useless, but I personally feel they are dangerous if used without knowledge of the class of side effects that comes with them.)


The idea behind this research is to generate an answer several times; if the results are semantically very different from each other, then they are probably wrong.
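Roughly, in code (a sketch of the idea only, not the paper's exact method; `sample_answer` and `same_meaning` are hypothetical stand-ins for an LLM call and for the entailment-based equivalence check the paper uses):

    import math

    def semantic_uncertainty(question, sample_answer, same_meaning, n=10):
        # Ask the same question several times.
        answers = [sample_answer(question) for _ in range(n)]

        # Greedily group answers that mean the same thing.
        clusters = []
        for a in answers:
            for c in clusters:
                if same_meaning(a, c[0]):
                    c.append(a)
                    break
            else:
                clusters.append([a])

        # Entropy over cluster sizes: high entropy means the model keeps
        # giving semantically different answers, which is taken as a sign
        # that it is likely confabulating.
        probs = [len(c) / n for c in clusters]
        return -sum(p * math.log(p) for p in probs)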


> Isn't it true that the only thing that LLMs do is "hallucinate"?

The Boolean answer to that is "yes".

But if Boolean logic were a good representation of reality, we would already have solved that AGI thing ages ago. In practice, your neural network is trained on a lot of samples that have some relations between them, and to the extent that those relations are predictable, the NN can be perfectly able to predict similar ones.

There's an entire discipline about testing NNs to see how well they predict things. It's the other side of the coin of training them.

Then we get to this "know the correct answer" part. If the answer to a question were predictable from the question's words alone, nobody would ask it. So yes, it's a defining property of NNs that they can't create answers to the kinds of questions people have been asking those LLMs.

However, they do have an internal Q&A database they were trained on. Except that the current architecture cannot know whether an answer comes from that database either. So it is possible to force them into giving useful answers, but currently they don't.


This isn't true, for the same reason that many NP problems are difficult to solve but easy to verify: checking an answer can be far easier than producing it.


Profound, but I disagree.

The fact checker doesn't synthesize the facts or the topic.



