Interesting. I would argue that fine-tuning makes sense especially when you want to narrow a small model down to a single task. In that case you get the most bang per parameter, in a way: a small model that performs very well in a very narrow space.