
To clarify, the assumptions I'm making for the calculation are:

1) a fixed probability of a server failing

2) a fixed erasure coding scheme used for all files

3) uncorrelated server failures

4) an erasure fragment is stored on a random server
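Under these assumptions, L for a single file follows directly from the binomial distribution: with a k-of-n erasure code, a file is lost when more than n - k of its fragments sit on failed servers. A minimal sketch (the scheme parameters and the 5% failure rate below are made up for illustration):

```python
from math import comb

def file_loss_prob(n, k, p):
    """Probability a file is lost: more than n - k of its n fragments
    fail, assuming independent per-server failure probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(n - k + 1, n + 1))

# Hypothetical 10-of-30 scheme with a 5% per-server failure probability:
print(file_loss_prob(30, 10, 0.05))
```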



It boils down to the following:

You can calculate a probability L of losing a given file.

Because we've assumed totally uncorrelated failures, this probability is the same for every file, and the probability of losing NO files out of T files is (1 - L)^T.

As you can see, this approaches 0 as T increases, meaning Pr(losing at least one file) approaches 1.

Using Sia's figure for the probability of file loss, which I would say is too low, but let's ignore that: they get L = 10^-19.

This leads to T ≈ 10^19 files before you expect to lose data. If you're erasure coding at the byte level, that's 10 exabytes.
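Plugging those numbers in shows the threshold directly; the naive (1 - L)^T underflows in floating point for tiny L, so this sketch uses log1p/expm1 for stability:

```python
from math import expm1, log1p

def prob_any_loss(L, T):
    # Pr(at least one of T files lost) = 1 - (1 - L)^T,
    # computed stably for tiny L via log1p/expm1.
    return -expm1(T * log1p(-L))

L = 1e-19
for T in (1e16, 1e18, 1e19, 1e20):
    print(f"T = {T:.0e}: Pr(any loss) = {prob_any_loss(L, T):.4f}")
```

At T = 10^19 (so L·T = 1), the loss probability is already 1 - e^-1 ≈ 0.63, and it climbs rapidly toward 1 beyond that.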

I expect your probability of failure is much lower than that of random nodes on a distributed global network of volunteers. So yes, ~1 petabyte is below the threshold, but there is a threshold.



