"The term 'hallucination' does anthropomorphize LLMs"
It does not, as hallucinations are not something only humans experience. Beyond that, it is now an accepted term of art used to describe a specific behavior exhibited by an LLM, one that is separate from the biological phenomenon.
The problem I have with the term is that we already have a word that describes much more accurately what these models are doing: guessing. Guessing is simply reporting information one does not know to be true. When a model lacks data points about a topic, each token it returns comes with lower and lower confidence. It's literally guessing. But since we aren't exposed to the confidence scores of the completion, it reads as full confidence when that is not the case.
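To make that concrete, here's a minimal sketch of what surfacing those per-token confidences can look like, using the Hugging Face transformers library with GPT-2 as a stand-in. The model, prompt, and library are illustrative assumptions on my part; hosted LLMs expose (or hide) this differently.

    # Illustrative only: print the probability the model assigned to each
    # token it generated. GPT-2 stands in for "an LLM" here.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "The first person to walk on the moon was"
    inputs = tok(prompt, return_tensors="pt")

    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=5,
            do_sample=False,               # greedy decoding
            return_dict_in_generate=True,
            output_scores=True,            # keep the per-step logits
        )

    prompt_len = inputs["input_ids"].shape[1]
    for step, logits in enumerate(out.scores):
        probs = torch.softmax(logits[0], dim=-1)
        token_id = out.sequences[0, prompt_len + step]
        print(f"{tok.decode(token_id)!r}: p = {probs[token_id]:.3f}")

Greedy decoding still happily emits a token even when the probability assigned to it is low, and none of that uncertainty survives into the text the user sees.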
That framing fails to cover the case where the model is confident in a response (at the token level) and is wrong, which I think is still considered hallucinating.
Misconceptions. There's no inherent reason a false statement would have lower probability than a true one.
To be clear, I'm referring to things like GPT-3.5 reportedly consistently messing up on statements like "what's heavier, two pounds of feathers or a pound of bricks". Being consistently wrong in the same way implies to me (but I don't know for sure) that the class of response is high probability in an absolute sense.
I can't find the article that demonstrated the sort of things that GPT consistently gets wrong, but it was things like common misconceptions and sayings.
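For what it's worth, the "high probability in an absolute sense" part is something you can poke at directly with an open model. A rough sketch, with GPT-2 standing in purely for illustration (it says nothing about what GPT-3.5 actually does): score a misconception-shaped sentence and the correct one, and compare which the model finds more probable overall.

    # Illustrative only: compare the total log-probability a small open model
    # assigns to a misconception-shaped sentence vs. the correct one.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def total_logprob(text):
        ids = tok(text, return_tensors="pt")["input_ids"]
        with torch.no_grad():
            # With labels=ids the model returns the mean cross-entropy over
            # the predicted positions; multiply back out for a total log-prob.
            loss = model(ids, labels=ids).loss
        return -loss.item() * (ids.shape[1] - 1)

    sentences = [
        "A pound of bricks is heavier than two pounds of feathers.",
        "Two pounds of feathers are heavier than a pound of bricks.",
    ]
    for s in sentences:
        print(f"{total_logprob(s):8.1f}  {s}")

If the familiar-but-wrong phrasing scores higher, that would at least be consistent with the "common misconception" class of response being genuinely high probability rather than a low-confidence guess.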
Very interesting. So it could produce, with high confidence, common real-world guesses found in its dataset.
So in that case it's not guessing, and in one sense not even wrong: it's producing the completion its training data makes "correct", yet that completion is still false. Now we're really getting into the weeds here, though.