ChatGPT appears to hallucinate or outright lie about everything

Buttflapper@lemmy.world · 2 months ago

ipkpjersi@lemmy.ml · 2 months ago

Don’t use them for facts, use them for assisting you with menial tasks like data entry.

maniclucky@lemmy.world · 2 months ago

Best use I’ve had for them (data engineer here) is things that don’t have a specific answer. Need a cover letter? Perfect. Script for a presentation? Gets 95% of the work done. I never ask for information since it has no capability to retain a fact.

boatswain@infosec.pub · 2 months ago

This is why my most frequent use of it is brainstorming scenarios for my D&D game: it’s really good at making up random bullshit.

Blackdoomax@sh.itjust.works · 2 months ago

It struggles to make more than 3 different bedtime stories in a row for my son, and they are always badly written, especially the conclusion, which is almost always the same. But at least their silliness (especially Gemini’s) is funny.

boatswain@infosec.pub · 2 months ago

I absolutely agree that it can’t create finished content of any particular value. For my D&D use case, its value is instead as a brainstorming tool; it can churn out enough ideas quickly enough that it’s easy for me to find a couple of gems that I can polish up into something usable.

Christer Enfors@lemm.ee · 2 months ago

Yes. I’ve experimented with this too. This is the perfect use case for LLMs - there are no wrong answers, the LLM should just make something up, which is what it does.

ipkpjersi@lemmy.ml · edit-2 2 months ago

I could have said anything, and then it would have agreed with me

Nope, I’ve had it argue with me, and I kept arguing my point but it kept disagreeing, then I realized I was wrong. I felt stupid but I learned from it.

It doesn’t “know” anything but that doesn’t mean that it can’t be right.

Red_October@lemmy.world · 2 months ago

Yeah? That’s… how LLMs work. It doesn’t KNOW anything, it’s a glorified auto-fill. It knows what words look good after what’s already there, it doesn’t care whether anything it’s saying is correct, it doesn’t KNOW if it’s correct. It doesn’t know what correct even is. It isn’t made to lie or tell the truth, those concepts are completely unknown to its function.

LLMs like ChatGPT are explicitly and only good at composing replies that look good. They are Convincing. That’s it. It will confidently and convincingly make shit up.

ngwoo@lemmy.world · 2 months ago

OP those minimum requirements are taken directly from the Meta Quest 3 support page.

Wren@lemmy.dbzer0.com · 2 months ago

Ok? I feel like people don’t understand how these things work. It’s an LLM, not a superintelligent AI. It’s not programmed to produce the truth or think about the answer. It’s programmed to paste a word, figure out what the most likely next word is, paste that word, and repeat. It’s also programmed to follow human orders as long as those orders abide by its rules. If you tell it the sky is pink, then the sky is pink.
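For anyone who wants to see that "paste a word, pick the most likely next word, repeat" loop in miniature, here is a rough toy sketch. To be clear about the assumptions: the corpus, `train_bigrams`, and `generate` are all made up for illustration, and a real LLM uses a neural network over tokens with sampling rather than a table of word-pair counts. The loop has the same shape though, and it shows how output can read fine and still be wrong.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; a real LLM is trained on vastly more text.
CORPUS = "the sky is blue . the sky is blue . the sky is pink . the grass is green ."


def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, prompt, max_words=10):
    """Greedy loop: append the most frequent next word, then repeat."""
    words = prompt.split()
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # nothing ever followed this word in the corpus
        next_word = candidates.most_common(1)[0][0]
        words.append(next_word)
        if next_word == ".":
            break  # treat the period as end of sentence
    return " ".join(words)


model = train_bigrams(CORPUS)
print(generate(model, "the grass"))  # prints: the grass is blue .
```

Nothing in there checks whether grass is blue. "blue" simply follows "is" more often than "green" does in the toy corpus, so that is what comes out.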

SPRUNT@lemmy.world · 2 months ago

Current AI is a glorified predictive text keyboard.

Wren@lemmy.dbzer0.com · 2 months ago

Exactly, it’s not something designed to output facts, it’s designed to output the most likely set of words.

SuperSleuth@lemm.ee · 2 months ago

There’s no way they used Gemini and decided it’s better than GPT.

I asked Gemini: “Why can great apes eat raw meat but it’s not advised for humans?” It said because they have a “stronger stomach acid”. I then asked “What stomach acid is stronger than HCl and which ones do apes use?” and was met with the response: “Apes do not produce or utilize acids in the way humans do for chemical processes.”

So I did some research and apes actually have almost neutral stomach acid and mainly rely on enzymes. Absolutely not trustworthy.

Daemon Silverstein@thelemmy.club · 2 months ago

use

I guess Gemini took the word “use” literally. Maybe if the word “have” were used, it’d change the output (or, even better, “and which ones do apes’ stomachs have?”, as “have” could imply ownership when “apes” is the subject of the verb).

thedeadwalking4242@lemmy.world · 2 months ago

You asked a generic machine a generic question and it gave you an extremely generic response. What did you expect? There was no context. It should have asked you more questions about what you’ll be doing.

ITGuyLevi@programming.dev · 2 months ago

You’re taking the piss right? Those seem like perfectly reasonable responses.

What video card is required to use it? None, it can be used standalone.

What video card do you need to stream to it from your PC? At least a 580 sounds okay for some games. You seem to be expecting it to lie, and then treating truthful information as a lie, because the information you held back (which game you want) is the reason for the heavier video card requirement.

linearchaos@lemmy.world · 2 months ago

I don’t want to sound like an AI fanboy but it was right. It gave you minimum requirements for most VR games.

No Man’s Sky’s minimum requirements are a 1060 and 8 gigs of system RAM.

If you tell it it’s wrong when it’s not, it will make s*** up to satisfy your statement. Earlier versions of the AI argued with people and it became a rather sketchy situation.

Now if you tell it it’s wrong when it’s wrong, it has a pretty good chance of coming back with information as to why it was wrong and the correct answer.

VinS@sh.itjust.works · 2 months ago

Well, I asked some questions yesterday about the classes in DAoC to help me choose a starter class. It totally failed there, attributing skills to the wrong class. When I poked it about this error it said: you are right, class X doesn’t do Mezz, it’s the speciality of class Z.

But class Z doesn’t do Mezz either… I wanted to save some time. In the end I had to do the job myself because I could not trust anything it said.

linearchaos@lemmy.world · 2 months ago

God I loved DAoC, played the hell out of it back in its heyday.

I can’t help but think it would have low confidence on it though, there’s going to be an extremely limited amount of training data that’s still out there. I’d be interested in seeing how well it fares on World of Warcraft or one of the newer Final Fantasies.

The problem is there’s as much positive confirmation bias as negative. We can probably sit here all day and I can tell you all the things that it picks up really well for me and you can tell me all the things that it picks up like crap for you and we can make guesses, but there’s no way we’ll ever actually know.

VinS@sh.itjust.works · 2 months ago

I like it for brainstorming while debugging, finding funny names, creating stories “where you are the hero” for the kids, or things where it doesn’t matter if it’s hallucinating. I don’t trust it for much more, unfortunately. I’d like to know your use cases where it works. It could open my mind to things I haven’t tried yet.

DAoC is fun, I’m playing on a freeshard (Eden actually, started one week ago, good community).

cheddar@programming.dev · 2 months ago

It’s incorrect to ask ChatGPT such questions in the first place. I thought we’d figured that out 18 or so months ago.

ABCDE@lemmy.world · 2 months ago

Why? It actually answered the question properly, just not to the OP’s satisfaction.

ramirezmike@programming.dev · 2 months ago

Because it could have just as easily confidently said something incorrect. You only know it’s correct by going through the process of verifying it yourself, which is why it doesn’t make sense to ask it anything like this in the first place.

ABCDE@lemmy.world · 2 months ago

I mean… I guess? But the question was answered correctly, I was playing Beat Saber on my 1060 with my Vive and Quest 2.

ramirezmike@programming.dev · 2 months ago

It doesn’t matter that it was correct. There isn’t anything that verifies what it’s saying, which is why it’s not recommended to ask it questions like that. You’re taking a risk if you’re counting on the information it gives you.

vxx@lemmy.world · 2 months ago

I think we shouldn’t expect anything other than language from a language model.

gravitas_deficiency@sh.itjust.works · 2 months ago

The “i” in LLM stands for intelligence

webghost0101@sopuli.xyz · 2 months ago

This is an issue with all models, including the paid ones, and it’s actually much worse than in the example, where you at least expressed not being happy with the initial result.

My biggest roadblock with AI is when I ask a minor clarifying question like “Why did you do this in that way?”, expecting a genuine answer, and am met with “I am so sorry, here is some rubbish instead.”

My guess is this has to do with the fact that LLMs cannot actually reason, so they also cannot provide honest clarification about their own steps; at best they can observe their own output and generate a possible explanation for it. That would actually be good enough for me, but instead it collapses into a pattern where any questioning is labeled as critique, and the logical follow-up for the assistant is to apologize and try again.

Tellore@lemmy.world · 2 months ago

I’ve also had a similar problem, but the trick is that if you ask it for clarification without sounding like you’re implying it’s wrong, it might actually try to explain the reasoning without trying to change the answer.

webghost0101@sopuli.xyz · 2 months ago

I have tried to be more blunt, with underwhelming success.

It has highlighted some of the everyday struggles I have with neurotypicals as someone who is neurodivergent. There are lots of cases where people assume I am criticizing when I was just expressing curiosity.

Dasus@lemmy.world · 2 months ago

“Converted what I said into the truth”

Now I’m not against the point you’re making in any way, I think the bots are hardcore yes men.

Buut… I have a 1060 and I got it around when No Man’s Sky came out, and I did try it on my 4k LED TV. It did run, but it also stuttered quite a bit.

Now I’m currently thinking of updating my card, as I’ve updated the rest of the PC last year. A 3070 is basically what I’m considering, unless I can find a nice 4000 series with good VRAM.

My point here being that this isn’t the best example you could have given, as I’ve basically had that conversation several times in real life, exactly like that, as “it runs” is somewhat subjective.

LLMs obviously have trouble with subjective things, as we humans do too.

But again, I agree with the point you’re trying to make. You can get these bots to say anything. It amused me that the blocks are so easily circumvented just by telling them to ignore something or by talking hypothetically. Idk, but at least very strong text-based erotica was easy to get out of them last year, which I think should not have been the case, probably.