
Offline high-speed data ingestion of multi-thousand-file, multi-hundred-GB data sets, followed by rapid transfer to permanent online storage (and replication fan-out, etc).

Seems convenient to allow optimization for high-speed sequential reads and random reads/writes at different parts of the life cycle, along with indexing, CRCs, signatures, etc.

One big issue with zip storing the index at the end, of course, is that a truncated file loses most of its content and is generally unrecoverable even in part, which this could also help with from a durability perspective.
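A toy demonstration of that failure mode with Python's stdlib `zipfile` (the archive contents here are made up for illustration): the central directory and end-of-central-directory (EOCD) record live at the tail of the file, so chopping the file in half leaves the data bytes intact but makes the archive unreadable as a whole.

```python
import io
import zipfile

EOCD_SIG = b"PK\x05\x06"  # end-of-central-directory signature (0x06054b50, little-endian)

def make_zip() -> bytes:
    """Build a small in-memory zip with two arbitrary sample entries."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", "hello")
        zf.writestr("b.txt", "world")
    return buf.getvalue()

data = make_zip()
print(EOCD_SIG in data)        # the index record sits at the end of the file

# Simulate a truncated download: keep only the first half of the bytes.
truncated = data[: len(data) // 2]
print(EOCD_SIG in truncated)   # the directory and EOCD are gone

try:
    zipfile.ZipFile(io.BytesIO(truncated))
except zipfile.BadZipFile as e:
    # Most of the file data is still physically present, but without the
    # trailing index a standard reader rejects the archive outright.
    print("unreadable:", e)
```

The local file headers in front of each entry do make partial salvage possible with special tooling, but as the comment says, a standard reader treats the truncated file as a total loss.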

Storing it at the beginning (without an end pointer) opens up the possibility that you have a valid-looking archive that is actually truncated and missing a lot of data, and you won't know until you look past the end (or validate total bytes or whatever, which doesn't work well when streaming).

Storing the index at the beginning, with a pointer and file signature at the end, plus all the other format extensions, does solve all of this. Which is convenient.
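For context on why the trailing index is load-bearing in standard zip: per the format spec, the EOCD record may be followed by a comment of up to 65,535 bytes, so readers locate the index by scanning backward from the end of the file for its signature, then following the stored offset to the central directory. A minimal sketch of that lookup:

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"         # end-of-central-directory signature
CENTRAL_DIR_SIG = b"PK\x01\x02"  # central directory file header signature

def find_central_directory(data: bytes):
    """Return (eocd_index, central_directory_offset), or None if no EOCD found."""
    # EOCD is at least 22 bytes and may be followed by a comment of up to
    # 65,535 bytes, so scan backward over at most that window.
    window = min(len(data), 22 + 65535)
    idx = data.rfind(EOCD_SIG, len(data) - window)
    if idx == -1:
        return None
    # The central directory's offset from the start of the file is a
    # little-endian u32 at byte 16 of the EOCD record (ignoring zip64 here).
    (cd_offset,) = struct.unpack_from("<I", data, idx + 16)
    return idx, cd_offset

# Build an arbitrary sample archive in memory to probe.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("example.txt", "payload")
data = buf.getvalue()

idx, cd_offset = find_central_directory(data)
print(data[cd_offset : cd_offset + 4] == CENTRAL_DIR_SIG)  # offset lands on the index
```

A front-of-file index with an end-of-file pointer and signature, as described above, gives you the same one-seek lookup while also letting a reader prove the file isn't truncated before trusting the index.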



Neat, let me know if you have any further questions. Would love to make this a more common thing that happens to zip files. As a result of me doing this in Firefox, zip utilities (e.g. 7-Zip) started complaining a lot less about this creative interpretation of the standard :)



