
Expanding on this history for readers here who are just catching up on this stuff: modern feedforward NNs descend from Rosenblatt's Perceptron paper of 1957. That was the first paper to really position "learning machines" that adapt based on accumulated stimulus activations, and it built on McCulloch and Pitts' 1943 paper, "A Logical Calculus of the Ideas Immanent in Nervous Activity." So yes, it's true that the field borrowed ideas from the brain and was positioned that way from the start, though the Perceptron's implementation isn't really anything like what Nature herself builds...

A Perceptron model is essentially a dot product: one multiply+add per input, accumulated into a single scalar and pushed through a hard threshold. In modern nomenclature you might call it "one channel of a single layer," or "one fully-connected layer with a single output and a step activation." In fact, that's where the old-fashioned name for neural networks came from: "Multi-Layer Perceptrons (MLPs)" are just a bunch of these stacked on top of each other. The older 1943 work had already shown that networks of such simple threshold units are surprisingly general (they can compute any Boolean function), so lots of possible arrangements are in some sense equivalent, and it made sense for Rosenblatt to start from the simplest model with the understanding that the field would build from there. (Our feedforward model ancestry is just one branch of the tree rooted in this work; there's a completely separate genealogy for other arrangements of the same components, like Boltzmann machines and other recurrent nets, that simply haven't been studied or scaled nearly as well... :-)
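
To make that concrete, here's a tiny sketch in Python/numpy (the data, weights, and learning rate are all made up for illustration; this is not Rosenblatt's Mark I): one dot product plus a bias, pushed through a hard threshold, trained with the classic error-driven update rule.

    import numpy as np

    def perceptron_output(w, b, x):
        # One "channel": weighted sum of inputs -> one scalar -> step function
        return 1 if np.dot(w, x) + b > 0 else 0

    def train_perceptron(X, y, epochs=20, lr=0.1):
        # Classic Perceptron rule: nudge the weights whenever a prediction is wrong
        w, b = np.zeros(X.shape[1]), 0.0
        for _ in range(epochs):
            for x_i, target in zip(X, y):
                error = target - perceptron_output(w, b, x_i)
                w += lr * error * x_i
                b += lr * error
        return w, b

    # Linearly separable toy problem (AND gate) -- a single Perceptron handles this
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 0, 0, 1])
    w, b = train_perceptron(X, y)
    print([perceptron_output(w, b, x) for x in X])  # -> [0, 0, 0, 1]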

The generality of Perceptrons left a lot of room for interpretive flexibility. Right from the start, reporters got wind of the theory, attended the conferences, and started writing about it. The NYTimes ran an article in 1958 describing the Perceptron as "the embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence."

In fact there was a very famous kerfuffle between Rosenblatt (the Perceptron's author) and Minsky & Papert (two other AI researchers). The latter two published a spicy book, Perceptrons (1969), arguing that a single Perceptron isn't general enough because a linear model can't even learn the XOR function. Many readers took the book as a takedown of the whole approach; it inadvertently dried up interest, choked funding across the board, and by many accounts helped bring on the first AI winter, which didn't lift until well into the '80s when these ideas were revitalized. You can read about the spilled tea here, it's quite fascinating: https://doi.org/10.1177%2F030631296026003005
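
The XOR point is easy to see in miniature. A minimal sketch (hand-picked weights, purely my own illustration, not from the book): no single linear threshold unit can separate XOR's two classes, but stack two layers of the exact same units and it falls right out.

    import numpy as np

    def step(z):
        # Hard threshold, applied elementwise
        return (z > 0).astype(int)

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

    # Hidden layer: one unit computes OR, the other computes AND
    W1 = np.array([[1.0, 1.0],   # OR unit
                   [1.0, 1.0]])  # AND unit
    b1 = np.array([-0.5, -1.5])

    # Output unit: fires when OR is on and AND is off, i.e. XOR
    W2 = np.array([1.0, -1.0])
    b2 = -0.5

    hidden = step(X @ W1.T + b1)
    print(step(hidden @ W2 + b2))  # -> [0 1 1 0]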



I think the first AI winter had more to do with some of the early researchers promising solutions, within months or a few years, to problems that are only now becoming tractable. In retrospect it doesn't seem like they could have gone much faster than they did. New ideas mattered, but increased computing speed probably mattered more: it's not like we could get dramatically better results than those researchers did by taking what we know now and running it on the hardware they had back then.



