It *was* committed successfully, if there's a valid commit frame in the WAL. It ...

sethev · on Nov 9, 2024

That's true. The protocol-aware recovery paper [1] talked about the challenges of disentangling corruption in the middle of the log from uncommitted data at the end of the log. This is an issue in other log-based systems as well.

The OP made it sound like an oversight rather than a consequence of assuming that the filesystem won't return corrupt data that was successfully written and fsync'ed. SQLite is pretty upfront about the tradeoffs: https://www.sqlite.org/howtocorrupt.html#_failure_to_sync

1: https://blog.acolyer.org/2018/02/27/protocol-aware-recovery-...

Andys · on Nov 9, 2024

This is likely to happen when your hardware lies about when it has committed data to disk.

In this case, it makes sense to cut short the WAL application, but you are probably right in saying it should throw an error (or at least a warning).

ncruces · on Nov 9, 2024

You're right that write reordering might cause this (an invalid frame followed by a few valid frames).

That shouldn't happen if you use PRAGMA synchronous=FULL, but in WAL mode it's very common to use NORMAL.
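
For reference, a minimal sketch of opting into FULL on a WAL-mode database from Python's standard `sqlite3` module. The helper name `open_wal_db` and the temporary file path are just for illustration; the two PRAGMAs are the real ones mentioned above.

```python
import os
import sqlite3
import tempfile

def open_wal_db(path, synchronous="FULL"):
    """Open a database in WAL mode with an explicit synchronous level.

    `open_wal_db` is a hypothetical helper name, not part of sqlite3.
    """
    if synchronous not in {"OFF", "NORMAL", "FULL", "EXTRA"}:
        raise ValueError(f"unknown synchronous level: {synchronous}")
    conn = sqlite3.connect(path)
    # journal_mode is persistent for the file; the PRAGMA returns the mode in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # synchronous is per-connection and must be set each time the file is opened.
    conn.execute(f"PRAGMA synchronous={synchronous}")
    # Reading it back returns an integer: 0=OFF, 1=NORMAL, 2=FULL, 3=EXTRA.
    level = conn.execute("PRAGMA synchronous").fetchone()[0]
    return conn, mode, level

# WAL needs a real file, so use a throwaway directory rather than :memory:.
tmpdir = tempfile.mkdtemp()
conn, mode, level = open_wal_db(os.path.join(tmpdir, "demo.db"))
print(mode, level)  # wal 2
conn.close()
```

Since the synchronous setting does not persist in the file, every connection that writes needs to set it; a single writer left at NORMAL gets NORMAL's weaker sync behavior.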

Not sure what the better strategy is here, but I'd definitely appreciate a mode that warns me before silently truncating a WAL that has valid frames in it.
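
To make the distinction concrete, here is a rough sketch of the kind of check such a mode might do. It uses a toy log format (fixed-size frames, each a CRC32 followed by an 8-byte payload), which is an assumption for illustration and not SQLite's actual WAL layout or checksum scheme. All function names are made up.

```python
import struct
import zlib

FRAME_PAYLOAD = 8                 # toy payload size, not SQLite's page size
FRAME_SIZE = 4 + FRAME_PAYLOAD    # 4-byte CRC32 header + payload

def make_frame(payload: bytes) -> bytes:
    """Build one toy frame: little-endian CRC32 of the payload, then the payload."""
    if len(payload) != FRAME_PAYLOAD:
        raise ValueError("payload must be exactly FRAME_PAYLOAD bytes")
    return struct.pack("<I", zlib.crc32(payload)) + payload

def frame_ok(frame: bytes) -> bool:
    """A frame is valid if it is full-sized and its stored CRC matches."""
    if len(frame) != FRAME_SIZE:
        return False
    (stored,) = struct.unpack("<I", frame[:4])
    return stored == zlib.crc32(frame[4:])

def classify_log(log: bytes):
    """Return (verdict, index) where index is the first bad frame, if any.

    "torn tail": nothing valid follows the first bad frame, so truncating
    there loses nothing that checks out.
    "mid-log corruption": valid frames follow a bad one, so truncating
    would silently drop frames that still verify.
    """
    frames = [log[i:i + FRAME_SIZE] for i in range(0, len(log), FRAME_SIZE)]
    valid = [frame_ok(f) for f in frames]
    if all(valid):
        return ("clean", len(frames))
    first_bad = valid.index(False)
    if any(valid[first_bad + 1:]):
        return ("mid-log corruption", first_bad)
    return ("torn tail", first_bad)

log = b"".join(make_frame(bytes([i]) * FRAME_PAYLOAD) for i in range(4))

# Flip one payload byte in frame 1: frames 2 and 3 still verify.
damaged = bytearray(log)
damaged[FRAME_SIZE + 4] ^= 0xFF
print(classify_log(bytes(damaged)))   # ('mid-log corruption', 1)

# Cut the last frame short, as an interrupted append would.
print(classify_log(log[:-3]))         # ('torn tail', 3)
```

A recovery routine could still stop replay at the first bad frame in both cases, but only the second verdict is safe to handle silently.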