Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I don't. Why would you want to cope with that? When does it matter that multiple bytearrays give same graph?


I used RDF canonicalization in a system that built a computation graph system where the inputs and outputs to a computation were one or multiple RDF graphs.

Many of the computations were doing things like inference that created new blank nodes, and were also doing so in a non-determinstic order, and at the same time many computations created structurally identical outputs (with a low cardinality of triples). By using RDF canonicalization as the basis for content addressing those small graphs, it became quite easy to avoid re-doing a lot of the computations that would have happened due to non-deterministic order. For larger graphs we just used a hash of the native serialization, as re-doing the computation was cheaper than trying to canonicalize.

Adding that canonicalization-based system gave the whole system a significant performance boost, so yeah, there are some scenarios where you "would want to cope with that".




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: