Right, I admit that I don't know the first thing about ML, so I tried an experiment.
Consider a language with the tokens "{[()]}" and the following grammar:
S := S S | '{' S '}' | '[' S ']' | '(' S ')' | <empty>
That is, "[()]" and "[]()" are valid sequences, but "[(])" and "))))" aren't. A child would quickly figure out the grammar if presented with some valid sequences.
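For concreteness, the check a pushdown automaton performs here is just a stack; a minimal sketch (my code, not part of the experiment):

```python
# Stack-based membership test for the bracket grammar
# S := S S | '{' S '}' | '[' S ']' | '(' S ')' | <empty>
def is_valid(seq):
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in seq:
        if ch in '([{':
            stack.append(ch)              # remember the open bracket
        elif ch in ')]}':
            if not stack or stack.pop() != pairs[ch]:
                return False              # closer with no matching opener
        else:
            return False                  # token outside the alphabet
    return not stack                      # everything opened must be closed

print(is_valid("[()]"))   # True
print(is_valid("[]()"))   # True
print(is_valid("[(])"))   # False
print(is_valid("))))"))   # False
```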
I generated all 73206 valid sequences with 10 tokens and used them as training data for the RNN text generator code at http://karpathy.github.io/2015/05/21/rnn-effectiveness/. After 500,000 iterations I'm still getting invalid sequences.
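For anyone wanting to reproduce this, a sketch of the enumeration (my code, reading "all valid sequences with 10 tokens" as exactly length 10; that reading gives Catalan(5)·3^5 = 10206 sequences, so the 73206 figure presumably uses a different counting convention):

```python
# Enumerate every balanced bracket sequence of exactly n tokens by either
# opening a new bracket or closing the most recently opened one.
def gen(n, stack, prefix, out):
    if n == 0:
        if not stack:                # fully closed: a valid sequence
            out.append(prefix)
        return
    for opener, closer in (('(', ')'), ('[', ']'), ('{', '}')):
        if len(stack) + 2 <= n:      # leave room to close everything opened
            gen(n - 1, stack + [closer], prefix + opener, out)
    if stack:                        # close the innermost open bracket
        gen(n - 1, stack[:-1], prefix + stack[-1], out)

out = []
gen(10, [], '', out)
print(len(out))  # 10206 = Catalan(5) * 3**5 under the exactly-10-tokens reading
```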
Am I doing something stupid, or is an RNN text generator weaker than a child (or a pushdown automaton)? Is GPT fundamentally more powerful than this?
GPT-3 can generate well-formed programs, so yes, it does things well beyond this complexity.
> After 500,000 iterations I'm still getting invalid sequences.
How frequently? If it's only the occasional invalid sequence, it might be down to the temperature-based sampling that code uses: with some small probability, it will return arbitrarily unlikely outputs.
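A minimal sketch of what temperature-based sampling does (function names are mine, not from the char-rnn code):

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into a sampling distribution; low temperature
    sharpens it toward argmax, high temperature flattens it toward uniform."""
    z = [x / temperature for x in logits]
    m = max(z)                               # subtract max for numerical stability
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def sample(logits, temperature=1.0):
    p = softmax_with_temperature(logits, temperature)
    # random.choices never rules a token out entirely, so a token the model
    # considers very unlikely (say, a mismatched bracket) is still emitted
    # once in a while.
    return random.choices(range(len(p)), weights=p)[0]
```

Even if the trained net assigns an ill-matched bracket only a tiny probability, a 10-token sequence takes ten draws, so occasional invalid output is expected at temperature 1; greedy (argmax) decoding or a lower temperature would show whether the network has actually learned the constraint.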
I suspect people overestimate its intelligence because they just can't grasp how much data it has ingested, or don't have a visceral sense of what an ocean of data can contain. There's a saying that "quantity has a quality all its own".
Human: I want to test your creativity. Please invent a new word and give its meaning.
GPT-3: Ok. Um... Tana means to hit someone with the intention to wound them.
Human: Please use the word tana in a sentence.
GPT-3: You are about to tana the man attacking you.
Human: Speak like a dwarf.
GPT-3: I ain't talkin' like a dwarf.
https://www.reddit.com/r/MachineLearning/comments/hvssqn/d_g...