
There are no emergent behaviors; LLMs are essentially memorizing statistical patterns in the data and using lexical cues to generate responses. They cannot explain themselves reliably because they don't know what they know, nor do they know what they don't know. In fact, LLMs don't truly 'know' anything at all. These are not thinking machines—they are simply the result of statistical pattern matching on steroids. That's why there will be no AGI, at least not from LLMs: https://www.lycee.ai/blog/why-no-agi-openai


Your post isn't providing any argument; you're just making statement after statement of a position that is pretty well known (no emergent behavior, just pattern matching) and pretty popular here on Hacker News. Maybe the link provides all the good stuff, but I'd suggest you at least give some flavor of what your arguments are.


Each argument could easily fill a book. Some valuable insights are discussed in the linked article, but most of the evidence comes from a series of research papers showing that LLMs perform poorly on prompts outside their training distribution. For example, even state-of-the-art LLMs can fail at simple tasks like counting the number of "r" letters in the word "strawberry", or fall victim to the reversal curse (e.g., LLMs trained on "A is B" often struggle to deduce "B is A"). Another interesting read on this topic comes from MIT's research, which further explores these limitations: (https://news.mit.edu/2024/reasoning-skills-large-language-mo...)
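For what it's worth, the "strawberry" task has a trivially checkable ground truth (a toy check in Python, unrelated to how the model itself works); the usual explanation for the failure is that models operate on tokens rather than individual characters:

    # The letter-counting task has a deterministic, trivially checkable answer.
    word = "strawberry"
    print(word.count("r"))  # 3 -- yet LLMs regularly get this wrong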


> or fall victim to the reversal curse (e.g., LLMs trained on "A is B" often struggle to deduce "B is A").

But should "Joe is human" logically imply that "human is Joe"?


The reversal curse isn't about predication like 'Joe is human' implying 'human is Joe.' It's about relations whose reverse really is entailed, such as identity statements and unique descriptions: a model trained on 'A is the composer of B' should be able to answer 'Who composed B?', yet in practice it often can't. Check out the summary of the paper:

"We expose a surprising failure of generalization in auto-regressive large language models (LLMs). If a model is trained on a sentence of the form "A is B", it will not automatically generalize to the reverse direction "B is A". This is the Reversal Curse. For instance, if a model is trained on "Valentina Tereshkova was the first woman to travel to space", it will not automatically be able to answer the question, "Who was the first woman to travel to space?". Moreover, the likelihood of the correct answer ("Valentina Tershkova") will not be higher than for a random name. Thus, models do not generalize a prevalent pattern in their training set: if "A is B" occurs, "B is A" is more likely to occur. It is worth noting, however, that if "A is B" appears in-context, models can deduce the reverse relationship. We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as "Uriah Hawthorne is the composer of Abyssal Melodies" and showing that they fail to correctly answer "Who composed Abyssal Melodies?". The Reversal Curse is robust across model sizes and model families and is not alleviated by data augmentation. We also evaluate ChatGPT (GPT-3.5 and GPT-4) on questions about real-world celebrities, such as "Who is Tom Cruise's mother? [A: Mary Lee Pfeiffer]" and the reverse "Who is Mary Lee Pfeiffer's son?". GPT-4 correctly answers questions like the former 79% of the time, compared to 33% for the latter. Code available at: this https URL."
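For concreteness, here's a toy sketch (my own, not the paper's code) of the shape of that finetune/eval setup; "Uriah Hawthorne" is from the abstract above, the second entry is invented here purely for illustration:

    # Toy sketch of the reversal-curse setup described in the abstract:
    # finetune on forward statements, then ask the reversed question.
    fictitious_facts = [
        ("Uriah Hawthorne", "the composer of Abyssal Melodies"),     # from the abstract
        ("Mabel Trescothick", "the inventor of the lunar sundial"),  # made up here
    ]

    # Forward direction: the sentences the model would be finetuned on.
    train_sentences = [f"{name} is {description}." for name, description in fictitious_facts]

    # Reverse direction: the questions it would be evaluated on afterwards.
    eval_pairs = [(f"Who is {description}?", name) for name, description in fictitious_facts]

    for sentence in train_sentences:
        print("train:", sentence)
    for question, expected in eval_pairs:
        print("eval :", question, "expected answer:", expected)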


I’d like you to explain what you don’t know, and why.


I don't know the history of the original tribes in Brazil. Why? Because I haven't studied it, read about it, or been taught anything on the subject. I know that such tribes exist, but I don't know their history.


FWIW, lately prompts seem more able to elicit that same sort of answer from both Claude and GPT-4x in thinly learned contexts.


And, next release, they'll have been trained on one more example of an answer for that sort of question ;)


Your post now will probably join the mix. So will mine and all the others here.


We can consider any question and conclude that we don’t know the answer. LLMs can’t meaningfully do that. They just make stuff up.


And yet I can provide modern LLMs with unique, never-before-asked scenarios that require the ability to reason about real-world phenomena, and they can respond in ways more thoughtful than the average person.

Much of human education is feeding us books of information about things we will never experience in our day-to-day lives, and convincing ourselves that it reflects reality, when in fact most of us have never personally experienced any evidence that what we learned is true.

The vast majority of what a person "knows" is biological statistical pattern matching on steroids.


What you're describing is no different from a linear regression predicting the value between two data points. Sometimes the prediction is close if enough data exists; other times you get wild hallucinations, with the model none the wiser about whether it's correct. All this tells us is that there is a lot of data out there on the internet that can still have useful information extracted from it.
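To make the analogy concrete, here's a toy sketch (my own, not anything the model literally does): a curve fitted to data interpolates reasonably between the training points and goes wildly wrong outside them, with nothing in the output flagging the error.

    # Regression analogy: good inside the data, "hallucinating" outside it.
    import numpy as np

    rng = np.random.default_rng(0)
    x_train = np.linspace(0, 5, 20)
    y_train = np.sin(x_train) + rng.normal(0, 0.05, size=x_train.shape)

    coeffs = np.polyfit(x_train, y_train, deg=5)  # deliberately flexible fit
    model = np.poly1d(coeffs)

    print("interpolation at x=2.5:", model(2.5), "| true value:", np.sin(2.5))
    print("extrapolation at x=9.0:", model(9.0), "| true value:", np.sin(9.0))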

Take someone like Ramanujan, who, with a couple of math books and on his own, could derive brilliant and novel discoveries in mathematics, instead of needing millions of man-hours' worth of reading material to replicate what is mostly a replacement for googling.


Don’t be hoodwinked by a plausibility engine. The central dogma of oracle-type LLMs is that plausibility converges towards accuracy as scale increases. This hypothesis remains very far from proven.


> And yet I can provide modern LLMs with unique, never-before-asked scenarios that require the ability to reason about real-world phenomena, and they can respond in ways more thoughtful than the average person.

Give me an example of that


One example is to give it a list of objects and ask it for the most plausible way to stack them without damaging the objects while also keeping the stack stable. Optionally, you can ask the model to generate a rationale for the solution it provides.

You can even invent a fictitious object that has never existed, define its properties, and ask the model to include it in the list.
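Something like this, for instance; the "glimmer cube" and all of its properties are invented on the spot, and the point is only that this exact combination almost certainly never appeared verbatim in the training data:

    # Illustrative construction of the stacking prompt described above.
    objects = [
        "a hardcover dictionary",
        "a half-full glass of water",
        "a raw egg",
        "a glimmer cube (a fictitious rigid 10 cm cube that weighs 50 g "
        "and shatters under more than 200 g of load)",
    ]

    prompt = (
        "Here is a list of objects: "
        + "; ".join(objects)
        + ". Describe the most plausible way to stack them so that the stack "
          "is stable and none of the objects are damaged, and explain why."
    )
    print(prompt)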


Sure, the question may be unique, but you said unique scenario; this scenario is very well trained for.


In what way is it very well trained for?


There are a lot of resources about similar problems on the internet, so LLMs will have some patterns to leverage, even if it's not certain you'll get correct answers.



