I built a smart volume control system out of distributed Kalman filters + classic PID control to track the EBU R 128 loudness envelope of an unknown sound source and attenuate the music gain to keep it at a comfortable level: https://wallfly.webflow.io/
The challenge comes when dealing with silence, or breaks in a song: if you detect silence, should the volume go up or down? Of course, the dynamics make the music and should not change, but you don't know that without access to the source signal. So you add latency to the PID controller, but then you get overshoot (classic time/accuracy trade-off).
To do perfect control you need access to the source signal, or lookahead, but you can still do a pretty good job without the source signal by capping the signal gain, i.e. only attenuate and never amplify. Still compresses the signal, though.
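A minimal sketch of the "only attenuate, never amplify" cap, assuming a PID loop that works in dB on a measured loudness value. The class name, the gains, and the -40 dB floor are hypothetical and not taken from the project linked above:

```python
class AttenuateOnlyPID:
    """PID loop on measured loudness whose output gain is capped at 0 dB."""

    def __init__(self, kp, ki, kd, target_lufs, min_gain_db=-40.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.target_lufs = target_lufs
        self.min_gain_db = min_gain_db   # hypothetical floor for the attenuation
        self.integral = 0.0
        self.prev_error = None

    def update(self, measured_lufs, dt):
        error = self.target_lufs - measured_lufs   # negative when too loud
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error

        candidate = self.integral + error * dt
        gain_db = self.kp * error + self.ki * candidate + self.kd * derivative

        # The cap: the gain may go down to min_gain_db but never above 0 dB.
        clamped = min(0.0, max(self.min_gain_db, gain_db))

        # Simple anti-windup: only keep integrating while the output is not saturated.
        if clamped == gain_db:
            self.integral = candidate
        return clamped


pid = AttenuateOnlyPID(kp=0.5, ki=0.1, kd=0.0, target_lufs=-23.0)
gain_in_silence = pid.update(measured_lufs=-70.0, dt=0.1)   # a break in the song
gain_when_loud = pid.update(measured_lufs=-10.0, dt=0.1)    # a loud passage
```

With the cap in place, a break in the song just leaves the gain at 0 dB instead of cranking the volume up, which is why it sidesteps the silence question at the cost of still compressing loud passages.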
In an LTI (linear time-invariant) system, coherence can compare the acceleration of both input and output signals to calculate the power (but not the contents) of external signals that entered the system.
Coherence is, in my opinion, underused in industry.
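A rough sketch of that idea, assuming NumPy and a hand-rolled Welch-style average (Hann window, 50% overlap). The "room" here is a made-up 3-tap FIR filter and the external sound is white noise, so treat this as an illustration of the principle rather than a recipe: the fraction of output power that is not coherent with the input, (1 - coherence) * Pyy, is attributed to whatever else entered the system.

```python
import numpy as np


def welch_spectra(x, y, nperseg=256):
    """Averaged auto- and cross-spectra over Hann-windowed, 50%-overlapping segments."""
    step = nperseg // 2
    window = np.hanning(nperseg)
    starts = range(0, len(x) - nperseg + 1, step)
    X = np.array([np.fft.rfft(window * x[s:s + nperseg]) for s in starts])
    Y = np.array([np.fft.rfft(window * y[s:s + nperseg]) for s in starts])
    pxx = np.mean(np.abs(X) ** 2, axis=0)
    pyy = np.mean(np.abs(Y) ** 2, axis=0)
    pxy = np.mean(np.conj(X) * Y, axis=0)
    return pxx, pyy, pxy


def external_power_fraction(x, y, nperseg=256):
    """Share of the output power that is not linearly explained by the input."""
    pxx, pyy, pxy = welch_spectra(x, y, nperseg)
    coherence = np.abs(pxy) ** 2 / (pxx * pyy)     # magnitude-squared coherence
    return np.sum((1.0 - coherence) * pyy) / np.sum(pyy)


rng = np.random.default_rng(0)
n = 1 << 16
x = rng.standard_normal(n)               # known input, e.g. the music being played
h = np.array([0.5, 0.3, 0.2])            # hypothetical room response (short FIR)
clean = np.convolve(x, h)[:n]            # what an undisturbed LTI system would output
noise = 0.5 * rng.standard_normal(n)     # external sound, uncorrelated with x
y = clean + noise                        # what the microphone picks up

estimated = external_power_fraction(x, y)
actual = np.var(noise) / np.var(y)
```

Note that only the power of the external sound comes out of this, not its waveform, which matches the "power but not the contents" caveat.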
The use of color is indeed fantastic. Except for choosing a black background for the graphs.
I wanted to print the webpage without wasting so much ink. And a white background improves readability too.
For that I needed to learn how to manipulate the image in order to invert only the greys without altering the rest of the colors. It actually was pretty easy with GIMP (duplicate layer, choose mode: HSL Color for the upper layer, invert colors for the lower layer).
It's a great idea, but it adds on more work and it's often not clear how to do it. (I'm still not sure how to do it in Pandoc Markdown.) Anyway, more examples: https://twitter.com/gwern/status/1251320700608143360
When implementing Kalman filters, a common issue is that the covariance matrix P tends to stop being positive semi-definite due to numerical errors. When this happens, the Kalman filter may produce REALLY weird results.
There is an easy fix for this that is rarely mentioned, unless one runs into this issue and googles it:
After updating (prediction and observation) P, just ensure its positive semi-definiteness by averaging it with its transpose:
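The formula itself did not survive in the comment; a minimal NumPy sketch of "averaging with its transpose", assuming P is a square array (the function name and the sample values are made up for illustration), would be:

```python
import numpy as np


def symmetrize(P):
    """Return the symmetric part of P: the average of P and its transpose."""
    return 0.5 * (P + P.T)


# A covariance whose off-diagonal entries have drifted apart through round-off.
P = np.array([[2.0, 0.3000001],
              [0.2999999, 1.0]])
P_sym = symmetrize(P)
```

This removes the asymmetry that accumulates from round-off. On its own it does not touch the eigenvalues, so it only enforces symmetry.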
What you have recommended will tend to stabilize the covariance matrix, but not for the reason you have given. If numerical stability is an issue you encounter, I would caution against using it. Try [1] instead.
You imply that any symmetric matrix is positive semi-definite. Take an orthogonal matrix U and a diagonal matrix D with one or more negative values.
P = U * D * U.T is symmetric but not positive semi-definite.
For example, if D[0,0] = -1 then U[:, 0].T * P * U[:,0] = -1.
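A quick NumPy check of this counterexample, with a 2x2 rotation chosen arbitrarily as the orthogonal matrix U (the angle and the second diagonal entry are assumptions for illustration):

```python
import numpy as np

theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthogonal: U @ U.T == I
D = np.diag([-1.0, 2.0])                           # one negative value

P = U @ D @ U.T                                    # symmetric by construction
u0 = U[:, 0]
quad = u0 @ P @ u0                                 # quadratic form along U[:, 0]
eigenvalues = np.linalg.eigvalsh(P)                # eigenvalues of a symmetric matrix
averaged = 0.5 * (P + P.T)                         # averaging with the transpose
```

Here `quad` comes out at -1 up to rounding, and averaging P with its transpose leaves it unchanged, so the negative eigenvalue survives.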
> That is what happened exactly in the Apollo rocket, back in the 60s the IMUs were veeery heavy and they could only carry one.
The first and most famous application of a Kalman filter in the Apollo program was for the problem of midcourse navigation. In this application, the Apollo PGNCS used the Kalman filter to combine the calculated trajectory based on vehicle dynamics with the sextant measurements (optical starsighting). The IMU was not used for midcourse navigation.
It was all very simple and concise (thank you) until the extended Kalman part, where it appears you sped up by 100x and just dismissed the whole non-linear predictor by saying we Taylor-expand it and create a Jacobian for every instance of t.
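For what that step amounts to in practice, here is a minimal sketch of an extended Kalman filter prediction, assuming NumPy and a made-up motion model (a point moving at constant speed v along heading theta). The model, the names, and the numbers are hypothetical and not from the article:

```python
import numpy as np


def f(x, v, dt):
    """Nonlinear motion model (hypothetical): move at speed v along heading theta."""
    px, py, theta = x
    return np.array([px + v * np.cos(theta) * dt,
                     py + v * np.sin(theta) * dt,
                     theta])


def jacobian_f(x, v, dt):
    """Partial derivatives of f with respect to the state, evaluated at x."""
    theta = x[2]
    return np.array([[1.0, 0.0, -v * np.sin(theta) * dt],
                     [0.0, 1.0,  v * np.cos(theta) * dt],
                     [0.0, 0.0,  1.0]])


def ekf_predict(x, P, Q, v, dt):
    F = jacobian_f(x, v, dt)      # re-linearize around the current estimate
    x_pred = f(x, v, dt)          # the mean goes through the full nonlinear model
    P_pred = F @ P @ F.T + Q      # the covariance goes through the linearization
    return x_pred, P_pred


x = np.array([0.0, 0.0, np.pi / 4])
P = np.eye(3) * 0.1
Q = np.eye(3) * 0.01
x_pred, P_pred = ekf_predict(x, P, Q, v=1.0, dt=0.1)
```

The "Jacobian for every instance of t" is the `jacobian_f` call: the first-order Taylor term of f is recomputed at each step because the best point to linearize around moves with the estimate.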
When I used to develop Kalman filters, visualizing the covariances was the best way to understand/debug the setup. Two great libraries that I used were Eigen3 (for the filter) and the Point Cloud Library (https://pointclouds.org/) for visuals.
Yes, and I love the pictures, but I wish they had a key attached, with the simple English name of each variable in the equation, because it’s no fun to hunt around a long article for the definition of a one-letter variable. Common problem on arXiv.
One thing the textbooks and explainers never seem to think to do is start with the one-dimensional case. All the confronting looking matrix equations reduce to an intuitive, even obvious set of arithmetic operations.
Indeed. The simplest one-dimensional case of the Kalman filter is just exponential smoothing, which you can implement and tweak in a spreadsheet. Unfortunately the leap to the general multivariate case is still pretty big.
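To make that concrete, here is a minimal sketch in plain Python of the one-dimensional filter for a random-walk state (x_k = x_{k-1} + w, z_k = x_k + v). The noise variances q and r and the starting values are arbitrary assumptions:

```python
import math


def scalar_kalman(measurements, q, r, x0=0.0, p0=1.0):
    """1-D Kalman filter for a random-walk state observed with noise."""
    x, p = x0, p0
    estimates, gains = [], []
    for z in measurements:
        p = p + q                # predict: uncertainty grows by the process noise
        k = p / (p + r)          # Kalman gain: how much to trust the measurement
        x = x + k * (z - x)      # update: move the estimate toward the measurement
        p = (1.0 - k) * p        # update: uncertainty shrinks
        estimates.append(x)
        gains.append(k)
    return estimates, gains


def exponential_smoothing(measurements, alpha, x0=0.0):
    """Same update rule with a fixed weight alpha instead of a computed gain."""
    x, out = x0, []
    for z in measurements:
        x = x + alpha * (z - x)
        out.append(x)
    return out


q, r = 0.01, 1.0
estimates, gains = scalar_kalman([1.0] * 200, q=q, r=r)

# Steady-state gain from the fixed point of the variance recursion.
p_steady = (q + math.sqrt(q * q + 4.0 * q * r)) / 2.0
k_steady = p_steady / (p_steady + r)
```

The gain settles to a constant after a few dozen steps, and from then on the update line is exactly the exponential smoothing update with alpha equal to that constant.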
Equation reading speed should be much slower than explication reading speed, which is in turn normally slower than prose reading speed.
(Compare with reading music: one can derive value from passages embedded in text without being able to sight-read, but one ought to make the effort to play them nevertheless.)
Great article. I've been trying to grok the Kalman filter for a while now, definitely seems to have clicked a little more this time. BTW if the author sees this the link at the bottom 'Some credit and referral should be given to...' seems to be broken.
There are some clever tricks used in radar systems that you can use to estimate the noise in a room, like coherence: https://en.wikipedia.org/wiki/Coherence_(signal_processing).