
I thought 200x was too extreme as well. In the compression literature, is there a way to estimate an upper limit on the lossless compressibility of a given data set?


There is not, because there could always be some tricky underlying generator that you just haven't discovered, and discovering that pattern is in general equivalent to solving the halting problem: the Kolmogorov complexity of a string is uncomputable. (See https://en.wikipedia.org/wiki/Kolmogorov_complexity#Uncomput...)

As a trivial example, if your dataset is one trillion binary digits of pi, it is essentially incompressible by any ordinary compressor, yet a program that generates it fits in well under 1 kB.
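
To make that concrete, here is a minimal sketch of such a generator in Python, using Gibbons' unbounded spigot algorithm. It emits decimal rather than binary digits, and the name pi_digits is just illustrative, but the point stands: the whole program is a few hundred bytes.

    import itertools

    def pi_digits():
        # Gibbons' unbounded spigot algorithm: yields decimal digits
        # of pi one at a time, using only integer arithmetic.
        q, r, t, k, m, x = 1, 0, 1, 1, 3, 3
        while True:
            if 4 * q + r - t < m * t:
                yield m  # the next digit is now certain; emit it
                q, r, m = 10 * q, 10 * (r - m * t), (10 * (3 * q + r)) // t - 10 * m
            else:
                # absorb one more term of the underlying series
                q, r, t, k, m, x = (q * k, (2 * q + r) * x, t * x, k + 1,
                                    (q * (7 * k + 2) + r * x) // (t * x), x + 2)

    print(list(itertools.islice(pi_digits(), 10)))  # [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]

A generic compressor like gzip or zstd sees those digits as statistically random and gains nothing, while the generator above reproduces them exactly.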


Cool. Thanks. How about lossy compression?


The same, since lossy compression can never do worse than lossless compression: a lossy codec can always fall back to emitting the lossless result, so its limits are at least as hard to pin down. (It is also messier, because you have to define your loss somehow. These Neuralink samples seemingly come as .wav files, but you probably wouldn't want to encode them with MP3!)
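
As a crude illustration of what "defining your loss" means, here is a hypothetical sketch (the quantization scheme and the SNR metric are my choices for illustration, not anything Neuralink uses): drop the low bits of 16-bit samples and measure how much signal survives.

    import numpy as np

    # Fake stand-in for a .wav channel: a sine wave plus noise, as int16.
    rng = np.random.default_rng(0)
    signal = (1000 * np.sin(np.linspace(0, 200, 30000))
              + rng.normal(0, 50, 30000)).astype(np.int16)

    # A crude lossy scheme: zero out the low 4 bits of every sample.
    bits_dropped = 4
    lossy = (signal >> bits_dropped) << bits_dropped

    # One possible loss metric: signal-to-noise ratio of the reconstruction.
    err = signal.astype(np.float64) - lossy.astype(np.float64)
    snr_db = 10 * np.log10((signal.astype(np.float64) ** 2).sum() / (err ** 2).sum())
    print(f"SNR after dropping {bits_dropped} bits: {snr_db:.1f} dB")

Whether that SNR is acceptable depends entirely on the downstream spike analysis; the choice of loss function sits outside compression theory.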



