You can make the argument that stagnant progress isn't actually a lack of progress when it comes to AI. Kilcher and Karpathy recently had a video where they discussed how some new model (PaLM or DALL-E 2, I forget which) showed zero progress for thousands of training steps and then suddenly rapid progress afterwards. It was as if the model spent thousands of training steps working on the concept and then finally grokked it. It could simply be that as we keep increasing the parameter count and data quality of these models, we will keep seeing progress on the road to AGI as a whole, but only in step changes that each require many training steps.
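For anyone who wants to see what that plateau-then-jump looks like, here is a rough toy sketch (my own illustration, not the setup from the video) of the classic grokking experiment: a small network trained on modular addition with heavy weight decay. The hyperparameters are guesses, and whether or when the jump shows up depends a lot on the weight decay, the train/val split, and the architecture.

```python
# Toy grokking sketch: train a small net on (a + b) mod P with strong weight
# decay and watch validation accuracy sit near chance for a long time before
# jumping. All hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

P = 97
pairs = [(a, b) for a in range(P) for b in range(P)]
torch.manual_seed(0)
perm = torch.randperm(len(pairs))
split = len(pairs) // 2                      # assumption: 50/50 train/val split
train_idx, val_idx = perm[:split], perm[split:]

X = torch.tensor(pairs)                      # shape (P*P, 2)
y = (X[:, 0] + X[:, 1]) % P                  # labels for modular addition

class ToyNet(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.embed = nn.Embedding(P, dim)
        self.mlp = nn.Sequential(nn.Linear(2 * dim, 256), nn.ReLU(), nn.Linear(256, P))
    def forward(self, x):
        e = self.embed(x)                    # (batch, 2, dim)
        return self.mlp(e.flatten(1))

model = ToyNet()
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

def accuracy(idx):
    with torch.no_grad():
        return (model(X[idx]).argmax(-1) == y[idx]).float().mean().item()

for step in range(1, 50_001):                # the jump, if it comes, can come very late
    opt.zero_grad()
    loss = loss_fn(model(X[train_idx]), y[train_idx])
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        # Typically train accuracy saturates early while val accuracy stays
        # flat for a long stretch, then rises sharply ("grokking").
        print(step, round(accuracy(train_idx), 3), round(accuracy(val_idx), 3))
```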
How many more parameters do you need?
PaLM has 530 BILLION parameters and still underperforms XLNet (~300 million) on NLP tasks; as such, very large language models are extreme failures. They do not improve the state of the art once you have proper datasets and do full-shot learning, and I'm not even talking about fine-tuning.
Very large language models hide from the layman that they are the most gigantic failure NLP has ever seen, by showing that they improve the state of the art only in zero- or few-shot learning.
Who cares? This is so cringe. Full-shot learning (training on the full dataset) is what matters most, and even full-shot learning does not yield satisfying accuracy on most NLP tasks (though it comes close).
Therefore the only use of PaLM is to get mediocre (70-80%) accuracy that beats the previous SOTA, and only on tasks that have no good-quality existing datasets.
And 530 billion parameters is close to the maximum we can realistically reach; it already costs ~$10 million in hardware and still underperforms a 300-million-parameter model in full-shot learning (e.g. dependency parsing, word sense disambiguation, coreference resolution, NER, etc.).
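To put a concrete picture on what training on the full dataset means for one of those tasks: ordinary supervised training of a few-hundred-million-parameter encoder on a labeled corpus, roughly like the sketch below using Hugging Face Transformers. The checkpoint, dataset, and hyperparameters are illustrative placeholders, not a reproduction of any benchmark number.

```python
# Rough sketch: supervised training on a full labeled NER dataset with a
# few-hundred-million-parameter encoder. Checkpoint, dataset, and
# hyperparameters are placeholders for illustration only.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                          DataCollatorForTokenClassification,
                          Trainer, TrainingArguments)

dataset = load_dataset("conll2003")          # assumes this dataset is available on the Hub
label_names = dataset["train"].features["ner_tags"].feature.names

checkpoint = "xlnet-large-cased"             # a few hundred million parameters
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForTokenClassification.from_pretrained(
    checkpoint, num_labels=len(label_names))

def tokenize_and_align(batch):
    # Tokenize pre-split words and copy each word's tag to its first subword;
    # remaining subwords and special tokens get -100, which the loss ignores.
    enc = tokenizer(batch["tokens"], is_split_into_words=True,
                    truncation=True, max_length=128)
    enc["labels"] = []
    for i, tags in enumerate(batch["ner_tags"]):
        prev, row = None, []
        for wid in enc.word_ids(batch_index=i):
            row.append(-100 if wid is None or wid == prev else tags[wid])
            prev = wid
        enc["labels"].append(row)
    return enc

tokenized = dataset.map(tokenize_and_align, batched=True)

args = TrainingArguments(output_dir="ner-full-dataset",
                         learning_rate=2e-5,
                         per_device_train_batch_size=16,
                         num_train_epochs=3)
trainer = Trainer(model=model, args=args,
                  data_collator=DataCollatorForTokenClassification(tokenizer),
                  train_dataset=tokenized["train"],
                  eval_dataset=tokenized["validation"])
trainer.train()
```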
It's crazy that people don't realize what a gigantic failure this is, but as always, it's because they don't care enough.