
Will it improve, though? I’m not a GPT hater or denier, but how do you even know that it hasn’t already hit a wall? They can increase the parameter count by 100x again, but correctness is not some knob they can just dial up to 10. What if the training dataset simply doesn’t have enough information for the correct answer to outweigh all the “con” noise? What if an answer requires a kind of reasoning that is inaccessible to LLMs?

Stories in this thread can just as well be boiled down to “I fed it corrections for a while, and this last time it didn’t f..k up and finally included everything in the answer”. What makes you think a future version would do anything more than exactly that, only better or quicker?

Edit: Another, probably closely related, question: can it answer “I don’t know this / not sure about these parts”? I’ve never seen that in chat logs.



> What if the training dataset simply doesn’t have enough information for the correct answer to outweigh all the “con” noise?

Indeed. I wonder what happens as available training data shifts from purely human-generated (now) to largely AI-generated (soon). Is this an information analogue to the “gray goo” doomsday that an uncontrolled self-replicating nano device could cause?

> can it answer “I don’t know this”

Such a fabulous question. This statement likely appears infrequently in the training data.


> can it answer “I don’t know this”

Afaik this is one of the newer approaches to training ML models; I've been looking into using it myself for a few things.

A lot of models were trained to provide some quantifiable output 100% of the time, even if that output was wrong. E.g. an image recognition model says "82.45% certain that is a dog", whereas it makes _all_ the difference for it to be able to say "82.45% certain that is a dog, and 95.69% certain I don't know what that is", indicating that the image has many features of a dog, but not enough to be more certain that it is a dog than that it isn't. It's the negative-test problem, I guess; we devs often forget to write those too.
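Roughly what I mean, as a toy sketch in Python (the labels, logits and 0.80 threshold are all made up, and a real system would learn its abstain behaviour during training rather than bolt it on afterwards):

    import numpy as np

    def softmax(logits):
        # Turn raw class scores into probabilities that sum to 1.
        z = np.exp(logits - logits.max())
        return z / z.sum()

    def classify_with_abstain(logits, labels, threshold=0.80):
        # Return the top label, or "I don't know" when no single class
        # is confident enough.
        probs = softmax(logits)
        best = int(np.argmax(probs))
        if probs[best] < threshold:
            # Confidence in "don't know" grows as the distribution flattens.
            return "I don't know", 1.0 - probs[best]
        return labels[best], probs[best]

    labels = ["dog", "cat", "fox"]
    print(classify_with_abstain(np.array([3.2, 0.4, 0.1]), labels))  # ('dog', ~0.90)
    print(classify_with_abstain(np.array([1.1, 1.0, 0.9]), labels))  # ("I don't know", ~0.63)

The other common route is an explicit "unknown"/background class trained on out-of-distribution examples, which is closer to baking the "I don't know" in at training time.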

In a way I wonder if that's how some of the systems in our brains work as well: we evolved certain structures to perform certain tasks, but when those structures fail to determine an action, the "I don't know" from that system can kick back to another. Think of the fear response: the brain tries to identify a dark shadow and can't, so it falls back to the evolutionary defence mechanism of being scared and cautious, of feeling fear, because that has saved the skins of our forebears.


Isn't that what the thumbs up/down buttons are for? Some kind of annotation that can be used to improve future iterations of training? They've got millions of people feeding in potentially billions of queries, and probably tons of feedback - wouldn't that result in improvement over time?
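As far as the publicly described RLHF-style pipelines go, that kind of feedback tends to be turned into preference pairs and used to train a reward model that scores the thumbs-up answer above the thumbs-down one; whether OpenAI does exactly this with the chat buttons is my assumption. A toy sketch with made-up data and made-up reward scores:

    import numpy as np

    # Hypothetical feedback log: for the same prompt, one response got a
    # thumbs-up and another a thumbs-down. Pairs like this are the usual
    # raw material for preference training; scoring them needs a real
    # reward model, so below we just plug in made-up scores.
    feedback = [
        {"prompt": "Explain CSS specificity",
         "chosen": "Careful, correct answer ...",
         "rejected": "Confidently wrong answer ..."},
    ]

    def pairwise_loss(score_chosen, score_rejected):
        # Bradley-Terry style loss: small when the reward model rates the
        # thumbs-up answer well above the thumbs-down one, large otherwise.
        return -np.log(1.0 / (1.0 + np.exp(-(score_chosen - score_rejected))))

    print(pairwise_loss(2.1, 0.3))  # ~0.15: model already prefers the good answer
    print(pairwise_loss(0.3, 2.1))  # ~1.95: model prefers the bad answer

The reward model then steers further fine-tuning of the chat model, so a downvote never edits the model's knowledge directly; it only nudges what gets reinforced. Whether that fixes factual gaps, as opposed to tone and formatting, is exactly the question upthread.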


Assuming the existing corpus was already coherent with what experts consider true (afaik they used all available books and common-knowledge resources), why would any amount of additional corrective statements make a difference for a retrained model? It’s not as if our written knowledge was wrong all along and we just tolerated it until mid-2022.

I don’t really understand how it works, how its iterations differ, or what the roadmap is. But what I’ve managed to learn (or rather, feel) about LLMs isn’t very consistent with such linear predictions.

Well, maybe it will use downvotes as anti-prompts? Existing sources must have had votes too, but probably only for a subset. Maybe the current iteration didn’t rank by votes at all, so the next one will really shine? Guess we’ll see soon.



