Right, I’m not the most well read on this stuff either, so I’m wondering now if existing architectures operate on this

> efficient frontier of parsimony against say time complexity.

As you mentioned before, regularization approximates parsimony. Could it be that what’s gained from this loss of precision with respect to parsimony is runtime guarantees (since we’re now mostly talking about constant-depth, circuit-esque DL architectures)? Or is the jump to continuous spaces more relevant? Are these the same thing?

I’ll have to read up more to see.