It's not novel research, but I think it drives home the point that many narrow applications of AI do not require the largest, latest (and most expensive) models. In many of those cases, a small fine-tuned model is the most performant and cost-effective option.
It is probably obvious to most who follow the space closely, but you'd be surprised how many engineers don't recognize this.
Well, one day it might be at the level of shell scripting. I don't think about "the tradeoffs of building a specialized shell script", I just do it because it's cheap and easy and solves a problem right then and there.
I don't know how you would even begin to make the same kind of observation for ML models, but it seems possible. The 2010s weren't exactly building out "trivial" models, but next to the architectures and optimizations out now, yeah, those models look like toys.
yes! check out https://distillabs.ai/ – it follows a similar approach, except that the evaluation set is held out before the synthetic data generation, which I would argue makes it more robust (I'm affiliated)
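For anyone unfamiliar with why that ordering matters, here's a rough sketch of the idea (my own illustration, not distillabs' code; generate_synthetic is a hypothetical stand-in for whatever model or prompt pipeline does the generation): split the real seed examples first, then generate synthetic training data only from the train split, so the eval set can't leak into what the model is fine-tuned on.

    import random

    def split_then_synthesize(seed_examples, generate_synthetic,
                              eval_frac=0.2, seed=0):
        """Hold out an eval set BEFORE any synthetic data generation.

        seed_examples: list of real labeled examples.
        generate_synthetic: hypothetical function that expands a list of
            real examples into additional synthetic training examples.
        """
        rng = random.Random(seed)
        shuffled = seed_examples[:]
        rng.shuffle(shuffled)

        n_eval = int(len(shuffled) * eval_frac)
        eval_set = shuffled[:n_eval]        # frozen before generation happens
        train_seed = shuffled[n_eval:]

        # Synthetic data is derived only from the train split, so the
        # held-out eval examples never influence the training data.
        synthetic_train = generate_synthetic(train_seed)
        return train_seed + synthetic_train, eval_set

The point is simply that if you generate synthetic data from the full pool and split afterwards, your eval examples (or near-duplicates of them) can end up in training, which inflates the reported numbers.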