No matter how fast CRC32 is, you still have to transfer the data from disk to the CPU. I suspect reading the entire DB would introduce unacceptable latency and I/O strain in many use cases for SQLite.
Pages? Postgres apparently does that if enabled. And when a page is read, it is already read… the server just checks the checksum. The only overhead should be recomputing the checksum and comparing it against the stored value.
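A minimal sketch of that per-page idea. The page size and the "CRC32 in a 4-byte trailer" layout here are made up for illustration (Postgres actually stores a 16-bit checksum in the page header); the point is just that verification only touches pages you were reading anyway:

```python
import zlib

PAGE_SIZE = 4096  # hypothetical page size

def write_page(payload: bytes) -> bytes:
    # Pad the payload to the page body size and append its CRC32
    # as a 4-byte little-endian trailer.
    body = payload.ljust(PAGE_SIZE - 4, b"\x00")
    return body + zlib.crc32(body).to_bytes(4, "little")

def read_page(page: bytes) -> bytes:
    # Recompute the checksum over the body and compare to the stored one.
    body, stored = page[:-4], int.from_bytes(page[-4:], "little")
    if zlib.crc32(body) != stored:
        raise IOError("page checksum mismatch: possible corruption")
    return body

page = write_page(b"row data")
assert read_page(page).startswith(b"row data")
```

The cost per read is one CRC over a page that is already in memory, not extra I/O.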
Now, if a bit flips somewhere you don't touch often, it would probably keep chugging along without noticing. Which kind of makes sense?
All of it. With a checksums-of-checksums scheme like a Merkle tree, you can effectively and efficiently checksum all the data while keeping incremental changes cheap: you only need to update the checksums of the data blocks you touched and their ancestor nodes in the tree.
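A toy sketch of that incremental update, using a hypothetical four-page "database" and SHA-256 (the page contents and tree shape are invented for illustration). Modifying one page only rehashes its leaf and the nodes on the path to the root, i.e. O(log n) work instead of rehashing everything:

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

# Hypothetical 4-page database.
pages = [b"page0", b"page1", b"page2", b"page3"]

def build(pages):
    # Full rebuild: leaf hashes, one internal level, then the root.
    leaves = [h(p) for p in pages]
    mids = [h(leaves[0] + leaves[1]), h(leaves[2] + leaves[3])]
    return leaves, mids, h(mids[0] + mids[1])

leaves, mids, root = build(pages)

# Incremental update: touching page 2 rehashes only its leaf,
# one internal node, and the root.
pages[2] = b"page2-modified"
leaves[2] = h(pages[2])
mids[1] = h(leaves[2] + leaves[3])
root = h(mids[0] + mids[1])

# Sanity check: the incrementally updated root matches a full rebuild.
assert root == build(pages)[2]
```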
I use SQLite for a "logging-like" thing a lot; the file is in the ~1.5 GB range and growing, and every minute some data is logged to it. Having to read 1.5 GB from disk every minute just to append a few records, each a timestamp and one 64-bit number, seems pointless.
If you already know the checksum of some huge chunk of the message, you don't need to recompute it to append data and get a new checksum (at least for CRC). On the read side you would want checksums at whatever granularity you want to be able to read, but for a larger combined CRC checksum you never need to reread the existing data to append or prepend to it.
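The append case is easy to demonstrate with `zlib.crc32`, which accepts a starting CRC value: feeding it the new bytes plus the prefix's checksum gives the checksum of the whole message without rereading the prefix. (The data here is just a placeholder.)

```python
import zlib

data = b"hello world, " * 1000
head, tail = data[:100], data[100:]

# Checksum of the whole message, computed in one pass.
full = zlib.crc32(data)

# Streamed instead: start from the already-known checksum of the
# prefix and feed only the appended bytes -- `head` is never reread.
crc = zlib.crc32(head)
crc = zlib.crc32(tail, crc)

assert crc == full
```

The prepend case needs a combine operation (e.g. zlib's `crc32_combine` in the C library), which merges two CRCs given only the length of the second chunk.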