
It's not novel research, but I think it drives home the point that many narrow applications of AI do not require the largest, latest (and most expensive) models. And in many of those cases, a small fine-tuned model is the most performant and cost-effective option.

It is probably obvious to most who follow the space closely, but you'd be surprised how many engineers don't recognize this.



It's a matter of ROI: When is it worth it to build something specialized?


Well, one day it might be at the level of shell scripting. I don't think about "the tradeoffs of building a specialized shell script"; I just do it because it's cheap and easy and solves a problem right then and there.

I don't know how you would even begin to make the same kind of observation for ML models, but it seems possible. The 2010s weren't exactly building out "trivial" models, but compared to the architectures and optimizations out now, yeah, those models are toys by comparison.


Is anybody working on making it easier and cheaper to build these specialized models?


Yes! At https://RunRL.com we offer hosted RL fine-tuning, so all you need to provide is a dataset and a reward function or environment.
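To make "reward function" concrete, here is a minimal sketch in Python. The (prompt, completion, answer) signature and the scoring rule are assumptions for illustration, not RunRL's documented API; the point is just that you hand the trainer a callable that scores each model completion:

    # Hypothetical reward function for RL fine-tuning. The signature is
    # an assumption for illustration, not RunRL's documented API.
    def reward(prompt: str, completion: str, answer: str) -> float:
        normalized = completion.strip().lower()
        target = answer.strip().lower()
        if normalized == target:
            return 1.0   # exact match gets full reward
        if target in normalized:
            return 0.5   # partial credit if the answer appears somewhere
        return 0.0       # otherwise no reward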


Yes! Check out https://distillabs.ai/ – it follows a similar approach, except the evaluation set is held out before the synthetic data generation, which I would argue makes it more robust. (I'm affiliated.)
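The ordering is the substance of that claim: if you carve out the eval set first, synthetic examples can never leak into evaluation. A minimal sketch of that split in Python; the function name and defaults are illustrative, not Distil Labs' actual pipeline:

    import random

    # Hold out the eval set *before* synthetic data generation so
    # generated examples can never contaminate evaluation. Names and
    # defaults are illustrative, not Distil Labs' actual pipeline.
    def split_before_synthesis(examples, eval_fraction=0.2, seed=0):
        rng = random.Random(seed)
        shuffled = list(examples)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * eval_fraction)
        eval_set, seed_pool = shuffled[:cut], shuffled[cut:]
        # Synthetic generation (e.g., LLM augmentation) sees only seed_pool.
        return eval_set, seed_pool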



