
As I said, it has been a long time!

I have dim memories of using k random basis vectors to project high-dimensional feature vectors down to k dimensions, and doing that m times to generate multiple projections as part of an LSH scheme. Min-hashing might have been involved.
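For anyone trying to picture that, here is a rough sketch of what such a scheme could look like. It is a reconstruction from memory, not a reference implementation: the Gaussian basis vectors, the sign-pattern bucket keys, and the names `make_projections`, `project`, and `bucket_keys` are all my own assumptions for illustration.

```python
import random

def make_projections(dim, k, m, seed=0):
    """Return m independent sets of k random Gaussian basis vectors (assumed choice)."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
        for _ in range(m)
    ]

def project(vec, basis):
    """Reduce a dim-dimensional vector to k dimensions via k dot products."""
    return [sum(v * b for v, b in zip(vec, row)) for row in basis]

def bucket_keys(vec, projections):
    """One key per projection: the sign pattern of the k projected coordinates.

    Bucketing by sign is an assumption; other schemes quantize the
    projected values instead.
    """
    return [tuple(x >= 0 for x in project(vec, basis)) for basis in projections]

# Hypothetical sizes: 50-dimensional features, k = 8, m = 4 projections.
projections = make_projections(dim=50, k=8, m=4)
rng = random.Random(1)
a = [rng.gauss(0.0, 1.0) for _ in range(50)]
keys = bucket_keys(a, projections)
print(len(keys), len(keys[0]))
```

Two vectors become candidate neighbours if they share a key in at least one of the m projections, so raising k makes each bucket more selective and raising m recovers some of the recall that costs.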



IIRC, minhashing is used to approximate Jaccard similarity (a set-theoretic measure), while random hyperplanes (aka simhashing) are used to approximate cosine similarity (a geometric/algebraic measure). So they solve different problems, even though some problems can be cast in terms of either framework.
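A small sketch of the difference, under some assumptions of mine: salted BLAKE2b stands in for the family of hash functions, the hyperplanes are Gaussian, and the signature sizes (256 and 512) and all function names are arbitrary choices for illustration. The minhash estimate is the fraction of matching signature slots; the simhash estimate uses the fact that two vectors at angle theta disagree on a random hyperplane's sign with probability theta / pi.

```python
import hashlib
import math
import random

def minhash_signature(items, num_hashes=256):
    """For each salted hash function, keep the minimum hash value over the set."""
    sig = []
    for i in range(num_hashes):
        salt = i.to_bytes(8, "little")
        sig.append(min(
            hashlib.blake2b(repr(x).encode(), digest_size=8, salt=salt).digest()
            for x in items
        ))
    return sig

def estimate_jaccard(sig_a, sig_b):
    """Fraction of slots where the two minhash signatures agree."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)

def simhash_bits(vec, hyperplanes):
    """One bit per hyperplane: which side of it the vector falls on."""
    return [sum(v * h for v, h in zip(vec, plane)) >= 0 for plane in hyperplanes]

def estimate_cosine(bits_a, bits_b):
    """Differing-bit fraction approximates theta / pi, so take cos of it."""
    differing = sum(x != y for x, y in zip(bits_a, bits_b)) / len(bits_a)
    return math.cos(math.pi * differing)

# Set-based comparison: true Jaccard is 40 / 120.
set_a, set_b = set(range(0, 80)), set(range(40, 120))
true_jaccard = len(set_a & set_b) / len(set_a | set_b)
est_jaccard = estimate_jaccard(minhash_signature(set_a), minhash_signature(set_b))

# Vector-based comparison: u and v are 45 degrees apart.
rng = random.Random(0)
hyperplanes = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(512)]
u, v = [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]
norm = lambda w: math.sqrt(sum(x * x for x in w))
true_cosine = sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v))
est_cosine = estimate_cosine(simhash_bits(u, hyperplanes), simhash_bits(v, hyperplanes))

print("Jaccard: true", true_jaccard, "estimated", est_jaccard)
print("cosine:  true", true_cosine, "estimated", est_cosine)
```

Both estimates should land near the true values, with the error shrinking roughly as one over the square root of the signature length.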



