Sustained reads would not even give 1M IOPS in that case. Maybe when you only read the same file, and it fits into the NVMe cache, which probably never happens in a production database...
I think you'd be surprised. Sustained write performance has gotten pretty good. Decent but not fancy consumer drives will often do 1GBps sustained, for bulkier writes. That's much better than we used to expect: flash has gotten much better with so many layers! This mid-range PCIe5 drive sustains a nice 1.5GBps: https://www.techpowerup.com/review/team-group-ge-pro-2-tb/6....
I don't think sustained reads are a problem? Benchmarks like CrystalDiskMark do a full-disk random read test; they're designed to bust through cache, AFAIK. 7.2 GB/s of 4k reads translates to roughly 1.8M IOPS. Even if that's massively optimistic, you'd need to slash a lot of zeroes/orders of magnitude to get down to 20k IOPS, which you will also pay >$100/mo for.
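The bandwidth-to-IOPS conversion above is just division; a quick sketch (the 7.2 GB/s figure is the one from the comment, not a measured number):

```python
# Back-of-envelope: convert a throughput figure into small-block IOPS.
# IOPS = bytes per second / bytes per operation.
def iops_from_bandwidth(gb_per_s: float, block_kib: int = 4) -> float:
    return (gb_per_s * 1e9) / (block_kib * 1024)

# 7.2 GB/s of 4 KiB reads -> roughly 1.76M IOPS
print(f"{iops_from_bandwidth(7.2):,.0f}")
```

Note the ~1.8M figure assumes the drive can actually sustain that bandwidth at 4 KiB granularity; real random-read numbers are usually quoted in IOPS directly for exactly this reason.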
Samsung 9910 has a 1:1 TB:GB ratio of LPDDR4X cache, i.e. 1 GB of DRAM per 1 TB of flash. I won't pretend to understand the magic NVMe drives possess, but if you got a 4 TB or 8 TB 9910, could you not in theory pull all of the data you require into cache?
I would assume, and it might be a poor assumption, that NVMe controllers don't pull in files but rather blocks. So even with a database that exceeds the cache size, if the active blocks of that database don't exceed the cache size, it could in theory be cached "indefinitely" under a read-only pattern.
The DRAM on an SSD like that isn't for caching user data; it's for caching the drive's metadata about which logical blocks (as seen by the OS) correspond to which physical locations in the flash memory.
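That metadata (the flash translation layer's logical-to-physical mapping table) is also why the 1 GB-per-TB DRAM ratio keeps showing up. A rough sizing sketch, assuming (hypothetically) one 4-byte physical-address entry per 4 KiB logical page:

```python
# Estimate the DRAM needed for an SSD's FTL mapping table.
# Assumed (not from any datasheet): 4-byte entry per 4 KiB page.
def ftl_dram_bytes(flash_tb: float, page_kib: int = 4, entry_bytes: int = 4) -> float:
    pages = flash_tb * 1e12 / (page_kib * 1024)  # number of mappable pages
    return pages * entry_bytes                    # one entry per page

# 1 TB of flash -> ~0.98 GB of mapping table, i.e. the familiar 1:1 TB:GB ratio
print(ftl_dram_bytes(1) / 1e9)
```

Under those assumptions the table alone eats essentially all of the drive's DRAM, which is why there's little room left over for caching user data.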
In the face of sustained writes? For how long?