There is a diagnostic of training, derived from random matrix theory, that relies on the spectral density (the eigenvalue distribution) of the correlation matrix of each layer's weights. The spectral density of each layer is fit to a truncated power law, and the layer is deemed properly trained if the fitted exponent alpha is just above two.

https://jmlr.org/beta/papers/v22/20-410.html
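
A rough sketch of the diagnostic described above, in Python with NumPy. It is not the paper's procedure. The function names `esd` and `hill_alpha` are made up for illustration, the correlation matrix is assumed to be X = W^T W / N for an N x M weight matrix, the cutoff `xmin` is picked by hand (here the median eigenvalue), and the fit is a plain continuous power-law maximum-likelihood estimate rather than a truncated power law.

```python
import numpy as np


def esd(W):
    """Eigenvalues of X = W^T W / N for an N x M weight matrix W.

    Assumption: this normalization is one common convention; the
    eigenvalues are obtained as squared singular values of W over N.
    """
    N = W.shape[0]
    sv = np.linalg.svd(W, compute_uv=False)
    return sv**2 / N


def hill_alpha(eigs, xmin):
    """Maximum-likelihood exponent for a density p(x) ~ x^(-alpha), x >= xmin.

    Simplification: this is the untruncated continuous power-law MLE,
    alpha = 1 + n / sum(log(x_i / xmin)), with xmin supplied by the caller.
    """
    tail = eigs[eigs >= xmin]
    return 1.0 + tail.size / np.sum(np.log(tail / xmin))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for a layer's weights; a real check would use a trained layer.
    W = rng.standard_normal((512, 256))
    eigs = esd(W)
    xmin = np.quantile(eigs, 0.5)  # hand-picked cutoff, an assumption
    print("fitted alpha:", hill_alpha(eigs, xmin))
```

The fitted value is sensitive to the choice of `xmin`, so a single number from this sketch should be read with caution.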