
Tldr: the author is annoyed at the Bitter Lesson.

Join the crowd dude. It's still true, no matter how inconvenient it is.




I suspect his diagnosis is pretty accurate, though. The bitter lesson came up when deep learning was already mainstream. The text discusses how that happened, and it may be the case that convenience beats accuracy. Accuracy is an epistemic value, but current AI is largely driven by market values. If accuracy manages to come along, great; otherwise, market-laden convenience reigns. Commercially, it is often more convenient to even change the world to make things easier for our models (consider how we're willing to create special places without pedestrians or human-driven vehicles for autonomous vehicles as a "solution" for their shortcomings).

This means money beats math?

It means trying to figure out how to build an intelligence always loses to mindlessly brute-forcing problems with more compute:

https://en.wikipedia.org/wiki/Bitter_lesson


It's not mindless brute-forcing: the details of the architecture, data, and training strategy still matter a lot (if you gave a modern datacenter to an AI researcher from the 60s, they wouldn't get an LLM very quickly). The bitter lesson is that you should focus on adjusting your techniques so that they can take advantage of processing power to learn more about your problem themselves, instead of trying to hand-craft half the solution yourself to 'help' the part that's learning.

Well, it means that thus far trying to build an intelligence has lost out to brute forcing it with more compute.

Nothing in particular suggests this is infinitely scalable.


unless you don't have unlimited compute, at which point you need other ideas

https://arielche.net/bitter-lesson


Then train your model elsewhere and size it as appropriate for the runtime environment.

If that really isn't an option, then yes, ML/AI isn't for you in this case.


I found this article a little weak, but there is an interesting parallel.

The 10,000 hours thing is encouraging because the amount of effort you put in is far more important than your natural ability.

... Until you get to the point where everyone is already working as hard as humanly possible, at which point natural ability becomes the sorting function again.


They have researchers working for insane salaries just so they don't go to another frontier lab to share their ideas. If you think it is just "mindless brute force", you don't understand anything. The idea is that the most effective methods are the ones that scale, but those ideas are then also limited by the compute available.


