Is it much worse than trusting Wikipedia or another encyclopedia? Maybe it is easier to make ChatGPT give you bad advice while encyclopedias are quite dry?
ChatGPT can just send you something that is completely wrong, and you have no way of knowing. That's why it's bad. On Wikipedia, for example, there is page history, page discussions, rules about sources, the sources themselves, and you can see who wrote what. Additionally, it's likely someone knowledgeable has looked at the EXACT text you're reading, with all its nuances, implied or not.
ChatGPT doesn't get nuances. It doesn't get subtle differences. It also gets large amounts of information wrong.
> ChatGPT can just send you something that is completely wrong, and you have no way of knowing.
This is true, if you decide to take a ChatGPT answer at face value without any further work. Personally I find it useful sometimes to ask an LLM a question, get an answer, and then verify that answer for myself. Doing web searches and pulling together relevant information to answer a question can be harder than getting an answer and then looking to verify it. Perhaps something like that was going on here; impossible to know, of course.
Here's an example: when asked about path buffer length in a programming context, ChatGPT 4o claimed today that 256 bytes is sufficient for *most systems*. That's an entirely false claim, like, completely invalid. It only says this because that's the tone that is expected of it. You can clearly tell that the info it wanted to convey was "256 is sufficient [here]", but it LOVES making things sound more general than they are.
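To make that concrete (a sketch of my own, not the exact exchange; it assumes a typical Linux/POSIX setup): on Linux PATH_MAX is 4096, and some systems don't define PATH_MAX at all, so a fixed 256-byte buffer is nowhere near a portable answer.

```c
/* Minimal sketch: why a fixed 256-byte path buffer is not
   "sufficient for most systems". Assumes a POSIX-ish platform. */
#include <limits.h>
#include <stdio.h>

int main(void) {
#ifdef PATH_MAX
    /* On typical Linux systems PATH_MAX is 4096, so paths far longer
       than 256 bytes are perfectly legal and do occur in practice. */
    printf("PATH_MAX here: %d (a 256-byte buffer covers %.1f%% of it)\n",
           PATH_MAX, 256.0 * 100.0 / PATH_MAX);
#else
    /* Some systems (e.g. GNU Hurd) don't define PATH_MAX at all,
       so no fixed-size buffer is portable; allocate dynamically. */
    puts("PATH_MAX is not defined on this system.");
#endif
    return 0;
}
```

Which is exactly the kind of nuance that "sufficient for most systems" papers over.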
you aren't gonna look up if that little detail is right; you're gonna slowly absorb more and more subtly false info.
The point of it is that I don't have to check. Otherwise I've just added an extra round of typing and validation.
Plus, now I've been biased by the immediate response. If it says "these CVEs don't have vulnerabilities", then I'm thinking they're probably okay and just need to validate, instead of starting from zero and doing due diligence. This will lead to confirmation bias or laziness.
Everyone sees the same Wikipedia. What if ChatGPT or Grok gave a different answer to constitutional questions when the user's IP was, say, from a DoD network? Nobody would know.
I do not have the same trust in Wikipedia. My experience as an editor is that for each page there are a few people who think they own the page, and they remove any edit that affects their text.
Actually, there is an incentive to remove edits on Wikipedia if you want to be part of the ego-fueled bureaucracy that considers WP its property.
With some humans, you can at least rely on their humility and ability to say "I don't know". This is a positive trait in people and I would rely on such honest people much more than on anyone who has all the answers to everything.
The machine seems to be unable to say or even detect that it does not know. At the same time, it communicates in flawless English (or whatever the current setting is), which is a trait we tend to associate with highly educated people from the real world. This short-circuits our bullshit detectors a bit.
> With some humans, you can at least rely on their humility and ability to say "I don't know". This is a positive trait in people and I would rely on such honest people much more than on anyone who has all the answers to everything.
You might, and I try to. Humanity as a whole? In practice, highly confident people who are totally sure but wrong still get listened to over people who are humble and aware of their limits.
Humans also short-circuit each other's BS detectors.
The bias to assume that computers are going to produce correct answers is extremely strong.
People intuit that Wikipedia is written by people, so they can apply that knowledge appropriately.
For some reason, most people have a knee jerk reaction to a fully synthetic statement that biases them strongly towards the assumption of veracity.
I always think of LLMs as "my functioning alcoholic veteran friend Bob, who has several PhDs and was blown up a couple of times in Iraq". That seems to be a good framework for intuiting the usefulness of LLM-generated output.
"The bias to assume that computers are going to produce correct answers is extremely strong."
This. We know that computers are very good at actual computation, and we don't expect them to go completely haywire in conversations either.
Though this is beginning to change, with the observation of just how blatant some of the hallucinations are, accusing random people of serious crimes, etc. But the pro-computer bias is still strong.
There was an awful case in the UK where a system accused post office workers of fraud. The software malfunctioned, but people were prosecuted and punished by courts relying on the supposed infallibility of computers, and some of the innocent victims committed suicide out of shame.
1. LLMs are put in a position where everything they say appears to be backed by encyclopedic knowledge of absolutely everything
2. LLMs try to use language that is very general, helpful and friendly, and as a result end up not properly portraying nuances, like "sometimes", "in this case", "not always", etc.
3. Humans are capable of saying "I don't know", or "I think XYZ but I'm not sure"
4. Humans convey that they aren't sure through nonverbal cues, such as a visible lack of confidence
These are differing sets of skills and issues. LLMs don't behave like humans, they don't solve things like humans, and people take what they say at face value by default.