> incredibly diverse, and results are going to be highly dependent on which dataset was cherry-picked for benchmarking
This naturally leads to a multi-model solution under one umbrella. Sort of MoE, with a selector (router, classifier) and specialized experts. If there is something that can't be handled by the existing experts, then train another one.
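A minimal sketch of what that could look like, with two toy experts and a naive rule-based router. All names and thresholds here are illustrative, not a real MoE implementation:

```python
# Two specialized "experts" for one-step forecasting, plus a router that
# dispatches a series to whichever expert looks more suitable.

def trend_expert(series):
    # naive linear extrapolation from the last two points
    return series[-1] + (series[-1] - series[-2])

def level_expert(series):
    # predict the mean of a recent window
    window = series[-4:]
    return sum(window) / len(window)

EXPERTS = {"trend": trend_expert, "level": level_expert}

def route(series):
    # toy classifier: a strong recent slope relative to the overall
    # spread suggests a trending series, otherwise treat it as flat
    slope = abs(series[-1] - series[-2])
    spread = (max(series) - min(series)) or 1.0
    return "trend" if slope > 0.1 * spread else "level"

def predict(series):
    return EXPERTS[route(series)](series)
```

A real router would itself be a trained classifier, and "train another expert" means adding a new entry to the table when routing confidence is low everywhere.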
the point is that it's a fundamentally flawed assumption that you can figure out which statistical model suits an arbitrary strip of time-series data just because you've imbibed a bunch of relatively different ones.
as long as you can evaluate the models' outputs, you can select the best one. you probably have some idea of what you are looking for, so you can check how well each output matches it.
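One way to make "evaluate the outputs and select the best" concrete: score each candidate model on a held-out tail of the series and keep the best scorer. The two candidate models here are hypothetical stand-ins:

```python
# Selection by held-out error: fit on the head of the series,
# score each candidate's forecast against the held-out tail.

def mse(pred, actual):
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual)

def last_value_model(history, horizon):
    # persistence forecast: repeat the last observed value
    return [history[-1]] * horizon

def mean_model(history, horizon):
    # forecast the historical mean
    m = sum(history) / len(history)
    return [m] * horizon

def select_best(series, models, horizon=3):
    history, tail = series[:-horizon], series[-horizon:]
    scored = {name: mse(fn(history, horizon), tail)
              for name, fn in models.items()}
    return min(scored, key=scored.get)
```

Replacing MSE with a proper likelihood under each model's noise assumptions gives the "how likely is the output" version of the same idea.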
the data is not a spherical horse in a vacuum. usually there is a known source that produces the data, and it's likely the same model works well on all data from that source, or maybe a small number of models. which means that, knowing the source, you can select the model that worked well before. even if the data comes from alien ships, they are likely to be from the same civilization.
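That amounts to a per-source cache of "the model that worked before", falling back to a full selection only for unseen sources. A sketch, with `selector` standing in for any evaluation-based selection routine:

```python
# Source-keyed model selection: remember which model worked well for
# each data source and reuse it instead of re-searching every time.

class SourceRouter:
    def __init__(self, selector):
        self.selector = selector          # fallback: evaluate all candidates
        self.best_by_source = {}          # source id -> chosen model name

    def model_for(self, source, series, models):
        # known source: reuse the cached choice; new source: run selection
        if source not in self.best_by_source:
            self.best_by_source[source] = self.selector(series, models)
        return self.best_by_source[source]
```

In practice you would also re-validate the cached choice occasionally, since a source can change behavior over time.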
I'm not saying that it's a 100% solution, just a practical approach.
it's a practical approach for serving normal data, but monitoring systems are most valuable when they make abnormal conditions inspectable. proper modeling of a system has that power.
so while this seems persuasive, it's fundamentally about normal data, which yields little value in extrapolation.