
It's ridiculous to equate this to a traditional Markov chain language model. Here's something a Markov chain certainly cannot do:

Human: I want to test your creativity. Please invent a new word and give its meaning.

GPT-3: Ok. Um... Tana means to hit someone with the intention to wound them.

Human: Please use the word tana in a sentence.

GPT-3: You are about to tana the man attacking you.

Human: Speak like a dwarf.

GPT-3: I ain't talkin' like a dwarf.

https://www.reddit.com/r/MachineLearning/comments/hvssqn/d_g...



Right, I admit that I don't know the first thing about ML, so I tried an experiment.

Consider a language with the tokens "{[()]}" and the following grammar:

S := S S | '{' S '}' | '[' S ']' | '(' S ')' | <empty>

That is, "[()]" and "[]()" are valid sequences, but "[(])" and "))))" aren't. A child would quickly figure out the grammar if presented with some valid sequences.
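For reference, here is a minimal sketch of the stack-based check a pushdown automaton would perform for this grammar. The names `PAIRS`, `OPENERS`, and `is_valid` are made up for illustration and do not come from any code mentioned in the thread.

```python
# Map each closing bracket to the opening bracket it must match.
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def is_valid(seq: str) -> bool:
    """Return True if seq is derivable from S in the grammar above."""
    stack = []
    for ch in seq:
        if ch in OPENERS:
            # Remember the opener until its matching closer shows up.
            stack.append(ch)
        elif ch in PAIRS:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
        else:
            # Any other character is not a token of this language.
            return False
    # Every opener must have been closed.
    return not stack


for s in ["[()]", "[]()", "[(])", "))))"]:
    print(repr(s), is_valid(s))
```

The stack is the only memory needed, which is why the language is recognizable by a pushdown automaton but not by a finite-state machine.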

I generated all 73206 valid sequences with 10 tokens and used them as input to the RNN text generator code at http://karpathy.github.io/2015/05/21/rnn-effectiveness/. After 500,000 iterations I'm still getting invalid sequences.
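One way such a training set could be enumerated is sketched below. This is not the poster's script; `BRACKETS` and `balanced` are hypothetical names. It relies on the fact that every non-empty balanced string splits uniquely into a first bracket pair wrapping an inner string, followed by a remainder, so each distinct string is produced exactly once. Counting distinct strings this way gives 42 × 3^5 = 10206 for length 10, so the figure quoted above may come from a different generation procedure (that part is a guess).

```python
from functools import lru_cache

BRACKETS = ("()", "[]", "{}")


@lru_cache(maxsize=None)
def balanced(n_pairs: int) -> tuple:
    """All distinct balanced strings built from exactly n_pairs bracket pairs."""
    if n_pairs == 0:
        return ("",)
    out = []
    # Unique decomposition: opener + inner + closer + rest,
    # where inner uses k pairs and rest uses the remaining ones.
    for k in range(n_pairs):
        for inner in balanced(k):
            for rest in balanced(n_pairs - 1 - k):
                for opener, closer in BRACKETS:
                    out.append(opener + inner + closer + rest)
    return tuple(out)


sequences = balanced(5)  # 5 pairs = 10 tokens
print(len(sequences))
```

Writing one sequence per line to a text file would give an input in the shape a character-level text generator typically expects.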

Am I doing something stupid, or is an RNN text generator weaker than a child (or a pushdown automaton)? Is GPT fundamentally more powerful than this?


GPT-3 can generate well-formed programs, so yes, it does things well beyond this complexity.

> After 500,000 iterations I'm still getting invalid sequences.

How frequently? If it's only the occasional issue, it might be down to the temperature-based sampling that code uses, which means it will, with some small probability, return arbitrarily unlikely outputs.
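To make the sampling point concrete, here is a rough sketch of temperature-scaled softmax. The logits are invented for illustration and `softmax_with_temperature` is a hypothetical helper, not a function from the linked code. It shows that a wrong token keeps a nonzero probability at any positive temperature, and that small per-token error rates compound over a 10-token sequence.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores: the model strongly prefers the correct closer ")".
tokens = [")", "]", "}"]
logits = [8.0, 0.0, 0.0]

for temperature in (1.0, 0.5):
    probs = softmax_with_temperature(logits, temperature)
    p_wrong = 1.0 - probs[0]
    # Chance of at least one wrong token in 10 independent draws
    # (independence is a simplifying assumption).
    p_bad_sequence = 1.0 - (1.0 - p_wrong) ** 10
    print(temperature, p_wrong, p_bad_sequence)
```

Picking the highest-probability token at every step (greedy decoding) would remove this source of errors, so comparing the two is a quick way to tell sampling noise from a model that has not learned the grammar.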


How can it do that? Did it read “tana” and the meaning somewhere?


I suspect people overestimate the intelligence because they just can't grasp how much data it's ingested or don't have a visceral sense of what an ocean of data can contain. There's a saying that "quantity has a quality all its own".


I don't think this is an overestimation of intelligence. That ability is itself intelligence.



