I attempted to preempt your question in the section of my blog post, "Why don’t ...

I attempted to preempt your question in the section of my blog post, "Why don’t you just pipeline the GPU code so that it saturates the GPU?" It's one of the less-detailed sections though so maybe you have further questions? I think the main thing is that since Anukari processes input like MIDI and audio data in real-time, it can't work ahead of the CPU, because those inputs are not available yet.

Possibly what you describe is a bit more like double-buffering, which I also explored. The problem here is latency: any form of N-buffering introduces additional latency. This is one reason why some gamers don't like triple-buffering for graphics, because it introduces further latency between their mouse inputs and the visual change.

But furthermore, when the GPU clock rate is too low, double-buffering or pipelining don't help anyway, because fundamentally Anukari has to keep up with real time, and every block it processes is dependent on the previous one. With a fully-lowered GPU clock, the issue does actually become one of throughput and not just latency.