> By contrast a well-educated renaissance man might have read a book every month or so from age 15 to 30? That doesn't seem to be anywhere near what GPT could swallow in a few seconds.
You're ignoring the insane amount of sensory information a human gets in 30 years. I think that absolutely dwarfs the amount of information that GPT-3 eats in a training run.
But that sensory information includes very few written words. GPT(n) isn't being trained on "worldly audio data", or "worldly tactile data", or in fact any sensory data at all.
So the two training sets are completely orthogonal, and the well-educated renaissance man is somehow able to take a very small exposure to written words and do at least as well as GPT(n) in processing and responding to them.
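To make the "very small exposure" concrete, here is a back-of-envelope sketch. All the figures are rough assumptions (a book a month for 15 years, ~90k words per book, GPT-3's reported ~300B training tokens, ~0.75 words per token), not measurements:

```python
# Back-of-envelope comparison: a dedicated reader vs. GPT-3's training corpus.
# Every number below is a rough assumption, not a measurement.

books_per_month = 1
years_reading = 15             # age 15 to 30
words_per_book = 90_000        # typical full-length book

human_words = books_per_month * 12 * years_reading * words_per_book

gpt3_tokens = 300_000_000_000  # ~300B tokens reportedly seen in training
gpt3_words = int(gpt3_tokens * 0.75)  # crude tokens-to-words ratio

print(f"human: ~{human_words:,} words")   # ~16 million
print(f"GPT-3: ~{gpt3_words:,} words")
print(f"ratio: ~{gpt3_words // human_words:,}x")
```

Under these assumptions the gap is four orders of magnitude, which is the asymmetry the comment is pointing at: the human does comparably well on written text with roughly 1/10,000th of the textual exposure.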
And the renaissance man has tons of structure encoded in his brain at birth already. Just like GPT-3 does before you give it a prompt. I'm not saying this is fully equivalent (clearly a baby can't spout correct LaTeX just by seeing three samples), but you simply cannot handwave away thousands of years of human evolution and millions of years of general evolution before that.
The renaissance man is very obviously not working solely based on a few years of reading books (or learning to speak/write).
A person who is never taught to read will never be able to respond to written text. So the renaissance-era man is working "solely" based on his lived experience with text, which compared to GPT(n) is tiny.
Ah! you cry. Don't humans have some sort of hard-wiring for speech and language? Perhaps. But it is clearly incapable, on its own, of enabling an untrained human to deal with written text. Does it give the human a head start in learning to deal with written text? Perhaps (maybe even probably). Either way, the human demonstrably needs much less training than GPT(n) does.
But that is sort of the point of the comment at the top of this chain.
That's an interesting point. I'm not sure how to measure that, though. Also, my guess is our sensors are only on for part of the day, and that input is filtered heavily by your attention process, e.g. you can't read two books at once.
Yeah. A dog has never read a book and has only a rudimentary understanding of language (if any; maybe they just pattern-match cause and effect), but a dog-level AI would be incredibly valuable. And you can train a dog to be fairly competent at a task in less than a year.
Written language is used, in large part, to express sensory data (e.g. colors, shapes, events, sounds, temperatures). Abstract models are, through inductive reasoning, extrapolated from that sensory information. So in effect more sensory data should mean more accurate abstract models.
For example, it might take several paragraphs to wholly capture all the meaningful information in one image in such a way that it can be reproduced accurately. Humans, and many animals, process large amounts of data before they are even capable of speech.
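A rough sketch of the "several paragraphs per image" point, using raw byte counts. The specific numbers (a small uncompressed RGB image, three 150-word paragraphs, ~6 bytes per word) are illustrative assumptions, not claims about actual information content:

```python
# Illustrative comparison: raw bytes in one modest image vs. a few
# descriptive paragraphs of text. Numbers are rough assumptions.

image_bytes = 224 * 224 * 3    # small RGB image, uncompressed

paragraph_words = 150
bytes_per_word = 6             # ~5 ASCII chars plus a space
text_bytes = 3 * paragraph_words * bytes_per_word  # three paragraphs

print(image_bytes)                # 150528
print(text_bytes)                 # 2700
print(image_bytes // text_bytes)  # ~55x more raw bytes in the image
```

Raw bytes are of course a crude proxy (images are highly compressible, and text is far denser in meaning per byte), but it gestures at why a lifetime of visual input can dwarf a lifetime of reading.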
The data GPT-3 was provided with pales in comparison. It is unclear whether these GPT models are capable of induction, because it may be that they need more, or better sanitised, data to develop abstract models. Therefore they should be scaled up further until they improve only negligibly. If even then they are still incapable of general induction, or have inaccurate models, then the transformer model is not enough, or perhaps we need a more diverse set of data (images, audio, thermosensors, etc.).