
Will it improve, though? I’m not a GPT hater or denier, but how do you even know that it hasn’t already hit a wall? They can increase the parameter count by 100x again, but correctness is not some knob they can just dial up to 10. What if the training dataset simply doesn’t have enough information for the correct answer to outweigh all the “con” noise? What if an answer requires a kind of reasoning that is inaccessible to LLMs?

Stories in this thread can just as well be boiled down to “I fed it corrections for a while, and this last time it didn’t f..k up and finally included everything in the answer”. What makes you think a future version would do anything more than exactly that, only better or quicker?

Edit: Another, probably closely related, question: can it answer “I don’t know this / not sure about these parts”? I’ve never seen that in chat logs.



> What if the training dataset simply doesn’t have enough information for the correct answer to outweigh all the “con” noise?

Indeed. I wonder what happens as available training data shifts from purely human-generated (now) to largely AI-generated (soon). Is this an information analogue to the “gray goo” doomsday that an uncontrolled self-replicating nano device could cause?

> can it answer “I don’t know this”

Such a fabulous question. This statement likely appears infrequently in the training data.


> can it answer “I don’t know this”

Afaik this is one of the newer approaches to training ML models; I've been looking into using it myself for a few things.

A lot of models were trained to provide some quantifiable output 100% of the time, even if that output was wrong. E.g. an image recognition model says "82.45% certain that is a dog", whereas it makes _all_ the difference for it to be able to say "82.45% certain that is a dog, and 95.69% certain I don't know what that is", indicating that the image has many features of a dog, but not enough to be more certain that it is a dog than that it isn't. It's the negative-test problem, I guess; we devs often forget to write those too.
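Roughly what I mean, as a toy sketch in Python (the labels, logits and 0.80 threshold are all made up, and a real system would learn its abstain behaviour during training rather than bolt it on afterwards):

    import numpy as np

    def softmax(logits):
        # Turn raw class scores into probabilities that sum to 1.
        z = np.exp(logits - logits.max())
        return z / z.sum()

    def classify_with_abstain(logits, labels, threshold=0.80):
        # Return the top label, or "I don't know" when no single class
        # is confident enough.
        probs = softmax(logits)
        best = int(np.argmax(probs))
        if probs[best] < threshold:
            # Confidence in "don't know" grows as the distribution flattens.
            return "I don't know", 1.0 - probs[best]
        return labels[best], probs[best]

    labels = ["dog", "cat", "fox"]
    print(classify_with_abstain(np.array([3.2, 0.4, 0.1]), labels))  # ('dog', ~0.90)
    print(classify_with_abstain(np.array([1.1, 1.0, 0.9]), labels))  # ("I don't know", ~0.63)

The other common route is an explicit "unknown"/background class trained on out-of-distribution examples, which is closer to baking the "I don't know" in at training time.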

In a way I wonder if that's how some of the systems in our brains work as well: we evolved certain structures to perform certain tasks, but when those structures fail to determine an action, the "I don't know" from that system can kick back to another. Think of the fear response: the brain tries to identify a dark shadow and can't, so it falls back to the evolutionary defence mechanism of being scared and cautious, of feeling fear, because that has saved the skins of our forebears.


Isn't that what the thumbs up/down buttons are for? Some kind of annotation that can be used to improve future iterations of training? They've got millions of people feeding in potentially billions of queries, and probably tons of feedback - wouldn't that result in improvement over time?
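As far as the publicly described RLHF-style pipelines go, that kind of feedback tends to be turned into preference pairs and used to train a reward model that scores the thumbs-up answer above the thumbs-down one; whether OpenAI does exactly this with the chat buttons is my assumption. A toy sketch with made-up data and made-up reward scores:

    import numpy as np

    # Hypothetical feedback log: for the same prompt, one response got a
    # thumbs-up and another a thumbs-down. Pairs like this are the usual
    # raw material for preference training; scoring them needs a real
    # reward model, so below we just plug in made-up scores.
    feedback = [
        {"prompt": "Explain CSS specificity",
         "chosen": "Careful, correct answer ...",
         "rejected": "Confidently wrong answer ..."},
    ]

    def pairwise_loss(score_chosen, score_rejected):
        # Bradley-Terry style loss: small when the reward model rates the
        # thumbs-up answer well above the thumbs-down one, large otherwise.
        return -np.log(1.0 / (1.0 + np.exp(-(score_chosen - score_rejected))))

    print(pairwise_loss(2.1, 0.3))  # ~0.15: model already prefers the good answer
    print(pairwise_loss(0.3, 2.1))  # ~1.95: model prefers the bad answer

The reward model then steers further fine-tuning of the chat model, so a downvote never edits the model's knowledge directly; it only nudges what gets reinforced. Whether that fixes factual gaps, as opposed to tone and formatting, is exactly the question upthread.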


Assuming the existing corpus was already coherent with what experts consider true (afaik they used all available books and common-knowledge resources), why would any amount of additional corrective statements make a difference for a retrained model? It’s not as if our written knowledge was wrong all along and we just tolerated it until mid-2022.

I don’t really understand how it works, how its iterations differ, or what the roadmap is. But what I’ve managed to learn (or rather, feel) about LLMs isn’t very consistent with such linear predictions.

Well, maybe it will use downvotes as anti-prompts? Existing sources must have had votes too, but probably only for a subset. Maybe the current iteration didn’t rank by votes at all, so the next one will really shine? Guess we’ll see soon.



