There's nuance to this; the deletes / overwrites are accomplished by bulk wiping...

slaymaker1907 · on June 11, 2021

Just to add to the explanation, SSDs are able to do this because they have a layer of indirection akin to virtual memory. This means that what your OS thinks is byte 800000 of the SSD may change it's actual physical location on the SSD over time even in the absence of writes or reads to said location.

This is a very important property of SSDs and is a large reason why log structured storage is so popular in recent times. The SSD is very fast at appends, but changing data is much slower.

hinkley · on June 11, 2021

> The SSD is very fast at appends, but changing data is much slower.

No, it's worse than that. The fact that it's an overly subtle distinction is the problem.

SSDs are fast while write traffic is light. From an operational standpoint, the drive is lying to you about its performance. Unless you are routinely stress testing your system to failure, you may have a very inaccurate picture of how your system performs under load, meaning you have done your capacity planning incorrectly, and you will be caught out with a production issue.

Ultimately it's the same sentiment as people who don't like the worst-case VACUUM behavior of Postgres - best-effort algorithms in your system of record make some people very cranky. They'd rather have higher latency with a smaller error range, because at least they can see the problem.

hawski · on June 11, 2021

Are there write-once SSDs? They would have a tremendous capacity. Probably good for long term backups or archiving. Also possibly with a log structured filesystem only.

eqvinox · on June 11, 2021

Making them write-once doesn't increase the capacity; that's mostly limited by how many analog levels you can distinguish on the stored charge, and how many cells you can fit. The management overhead and spare capacity to make SSDs rewritable is –to my knowledge– in the single digit percentages.

(Also you need the translation layer even for write-once since flash generally doesn't come 100% defect free. Not sure if manufacturers could try to get it there, but that'd probably drive the cost up massively. And the translation layer is there for rewritable flash anyway... the cost/benefit tradeoff is in favor of just living with a few bugged cells.)

entangledqubit · on June 11, 2021

I suspect that hawki was assuming that a WORM SSD would be based on a different non-flash storage medium. I don't know any write once media that has similar read/write access times to an SSD.

FWIW, there are WORM microsd cards available but it looks like they still use flash under the hood.

hawski · on June 11, 2021

I don't know enough specifics, so I didn't assume anything :) In fact I was not aware of non-flash SSDs.

Because of the Internet age there probably is not much place for write once media anyway, even it would be somewhat cheaper. But maybe for specialized applications or if it would be much much cheaper per GB.

wongarsu · on June 11, 2021

The only write once media I'm aware of that is in significant use are WORM tapes. They don't offer significant advantages over regular tapes, but for compliance reasons it can be useful to just make it impossible to modify the backups.

Pet_Ant · on June 12, 2021

What about EPROMs? I mean could those be scaled down with 7nm lithography to be energy efficient incorruptible fast storage?

AstralStorm · on June 12, 2021

You mean the UV erasable kind? Essentially phase change memory? Very hard to miniaturize?

Because the older Flash aren't as stable when miniaturized as you'd expect. Current flash is a direct descendant of these, they only are more stable because the cells are much chunkier and thus with lower leakage.

Pet_Ant · on June 14, 2021

I was thinking of the anti-fuse based PROMs not EPROMs sorry. I figure if you miniatures those they'd be faster and denser and use-based fully reliable.

entangledqubit · on June 14, 2021

I thought along that route as well but I'm not sure how the feature scale of a fuse compares to the size of a flash cell - especially since the latter can contain multiple bits worth of info (MLC). Assuming the fuse write results in a serious physical state change of some sort, I suspect that the energy required for high speed writes (at SSD speeds) may become substantial.

That being said, it's not clear how innovation has occurred in this direction in the storage space.

jonny_eh · on June 11, 2021

> Making them write-once doesn't increase the capacity

It could theoretically make them cheaper. But I guess that there wouldn't be enough demand, so you'd be better off having some kind of OS enforced limitation on it.

weinzierl · on June 11, 2021

I find this a super interesting question. I always assumed that long term stability of electronic non-volatile memory is worse than that of magnetic memory. When I think about it, I can't think of any compelling reason why that should be the case. Trapped electrons vs magnetic regions; I have no intuition which one of them is likely to be more stable.

There is a question on stackoverflow about this topic with many answers but no definitive conclusion. There seem to be some papers touching the subject but at a glance I couldn't find anything useful in them.

[1] https://superuser.com/questions/4307/what-lasts-longer-data-...

olejorgenb · on June 11, 2021

According to https://www.ni.com/en-no/support/documentation/supplemental/... (Seems kinda reputable at least)

"The level of charge in each cell must be kept within certain thresholds to maintain data integrity. Unfortunately, charge leaks from flash cells over time, and if too much charge is lost then the data stored will also be lost.

During normal operation, the flash drive firmware routinely refreshes the cells to restore lost charge. However, when the flash is not powered the state of charge will naturally degrade with time. The rate of charge loss, and sensitivity of the flash to that loss, is impacted by the flash structure, amount of flash wear (number of P/E cycles performed on the cell), and the storage temperature. Flash Cell Endurance specifications usually assume a minimum data retention duration of 12 months at the end of drive life."

wtallis · on June 12, 2021

> During normal operation, the flash drive firmware routinely refreshes the cells to restore lost charge. However, when the flash is not powered the state of charge will naturally degrade with time.

You have to be careful how you interpret this bit. "Normal operation" here assumes not just that the SSD is powered, but that it is actively used to perform IO. Writes to the SSD will eventually cause data to be refreshed as a consequence of wear leveling; if you write 1TB per month to a 1TB drive then every (in-use) cell will be refreshed approximately monthly, and data degradation won't be a problem.

If you have an extremely low-write workload, the natural turnover due to wear leveling won't keep the data particularly fresh and you'll be dependent on the SSD re-writing data when it notices (correctable) read errors, which means data that is never accessed could degrade without being caught. But in this scenario, you're writing so little to the drive that the flash stays more or less new, and should have quite long data retention even without refreshing stored data.

dataflow · on June 11, 2021

> When I think about it, I can't think of any compelling reason why that should be the case. Trapped electrons vs magnetic regions; I have no intuition which one of them is likely to be more stable.

My layman intuition (which could be totally wrong) is that trapped electrons have a natural tendency to escape due to pure thermal jitter. Whereas magnetic materials tend to stick together, so there's at least that. Don't how much of this matches the actual electron physics/technology though...

sharikone · on June 11, 2021

Hmm I don't think this is conclusive. Thermal jitter makes magnetic boundaries change too, and of course you have to add to it that it is more susceptible to magnetic interference.

I don't have intuition either, but I don't think this explanation is sufficient

madacol · on June 11, 2021

> Trapped electrons vs magnetic regions;

From the physics point of view, aren't both cases the same thing?.

Isn't magnetic regions a state of the electric field? so if I move electrons in and out, the electric field should be changing as well

mananaysiempre · on June 11, 2021

No. A region of a piece of material is magnetized in a certain direction when its (ionized) atoms are mostly oriented in that direction, the presence of a constant magnetic field is (roughly speaking) only a consequence of that.

So flash memory is about the electrons, while magnetic memory is about the ions.

pas · on June 11, 2021

Aren't permanent magnetics a direct result of oriented spins? (So due to quantum effects?)

dmitrygr · on June 11, 2021

Modern multi-bit-per-cell flash has quite terrible data retention. It is especially low if it is stored in a warm place. You'd be lucky to see ten years without an occasional re-read + error-correct + re-write operation going on

phonon · on June 11, 2021

Like https://en.wikipedia.org/wiki/ROM_cartridge ?

I think Nintendo uses https://www.mxic.com.tw/en-us/products/ROM/Pages/default.asp...

https://www.mxic.com.tw/CachePages/en-us-Product-ROM-default...

salawat · on June 12, 2021

Any SSD you go through the trouble of building a max capacity disk image for, then dd'ing onto the disk before removing?

I mean... This is general purpose HW here. Write once SSD is a workflow more than an economically tenable use-case in terms of making massive size write once then burn the write circuit devices.

wtallis · on June 11, 2021

I don't think anyone would make literally write-once drives with flash memory; that's more optical disk territory. But zoned SSDs and host-managed SMR hard drives make explicit the distinction between writes and larger-scale erase operations, while still allowing random-access reads.

juloo · on June 11, 2021

That would be magnetic tapes.

alisonkisk · on June 11, 2021

Append-only garbage-collected storage was used in data center even when hard disks were (and are) popular because it's more reliable and scalable.

zdw · on June 11, 2021

inspired by that last sentence, the analogy could be rewritten as:

  - lines on page
  - pages of paper
  - whole notebooks

and might be easier for people to grok than the earlier houses/paint analogy.

jazzyjackson · on June 11, 2021

I don’t know, I like the drama of copying a neighborhood and tearing down the old one xD

jnwatson · on June 11, 2021

Reminds me of https://xkcd.com/1737/.

"When a datacenter catches fire, we just rope it off and rebuild one town over."

jdironman · on June 11, 2021

Speaking of xkcd, 2021 is return of "All your bases" See alt-text on image.

https://xkcd.com/286/

wand3r · on June 11, 2021

I think the explanation is sound maybe (I am not that familiar) but the analogy gets a bit lost when you talk about buckets of houses and buckets of vacant lots.

Maybe there is a better analogy or paradigm to view this through.

cduzz · on June 14, 2021

I should have been a little more clear -- the urban planner managing the house building / copying and neighborhood destruction (the realtime controller) The rules are: 1) You can build a house kinda quickly 2) You can't modify a house once it is built 3) you can only build a house on a vacant lot 4) you can change the "mailing address" (relative to the physical location) of the house 5) you can only knock down whole blocks of houses at once (not one at a time) 6) each time you flatten a block more crap accumulates in that block until after a while you can't build there anymore. 7) the flatten / rebuild step may be quite slow (because you have lots of houses to build) 8) You can lie and say you built a house before it is finished, if you don't have too many houses to build. (if you've got an SSD with a capacitor / battery or tiny cache and reserved area for that cache) 9) you've lied to the user and you actually have 5-100% more build-able area than you've advertised. 10) you have a finite area so eventually the dead space accumulates to the point where you can no longer safely build.

So -- you keep track of vacant lots and "dead" houses (abandoned but not flattened); whenever you've got spare time you will copy blocks with some ratio of "live" to abandoned houses to new lots so the new block only has live houses.

These pending / anticipatory compaction/garbage collection operations are what I refer to as "buckets" -- having to compact 300 (neighborhoods) blocks to achieve 300 writes is going to result in glacial performance because of this huge write amplification (behind the scenes the drive is duplicating 100s of mb / gb of data to write a small amount of user modifications)

As you might imagine, there are lots of strategies to how to approach this problem, some of which give you an SSD with extremely unpredictable (when full) performance, others will give a much more consistent but "slower" performance.

daniellarusso · on June 11, 2021

Spoiler alert - This is the plot to ‘The Prestige’.