Hi, I'm the author of this article. I'm grateful for the positive comments! I plan to explore FALCONN (pointed to by gajjanag) and the beautiful article of Bartosz Ciechanowski pointed out by lgvld.
Z is the set of integers {..., -2, -1, 0, 1, 2, ...}.
Z is called "Z" because influential German mathematicians chose the letter to stand for "Zahlen" ("numbers" in German).
As a fun fact, the letter "N" stands for "natural numbers," meaning either the positive integers, or the positive integers together with zero. However, the meaning of N is _not_ standardized, because mathematicians disagree about whether to include zero. When I separately told a couple of my professors this in grad school, neither believed me: one thought it was crazy to exclude zero, and the other thought it was crazy to include it!
@author
A question, if you don't mind. I found this article very interesting, but I have the impression (or heard somewhere) that LSH is really suited to much higher-dimensional problems than 2D. Put another way, LSH seems like overkill for a two-dimensional case like this example, which could be solved more simply, but those simpler methods don't scale to higher dimensions, and it is in high dimensions that LSH earns its keep. Is this clear, and is my understanding correct?
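For concreteness, here is a minimal sketch of one classic LSH family (random hyperplanes for cosine similarity) in arbitrary dimension. This is just the textbook construction, not anything specific to the article; all the names and parameters here are my own illustration:

```python
import random

def random_hyperplane_hash(dim, n_bits, seed=0):
    # One LSH function: n_bits random hyperplanes through the origin.
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
    def h(v):
        # Each bit records which side of a hyperplane v falls on;
        # nearby vectors (small angle) tend to agree on most bits.
        return tuple(sum(p * x for p, x in zip(plane, v)) >= 0
                     for plane in planes)
    return h

h = random_hyperplane_hash(dim=100, n_bits=16, seed=42)
a = [1.0] * 100
b = [1.0] * 99 + [0.9]               # nearly identical direction to a
c = [(-1.0) ** i for i in range(100)]  # roughly orthogonal to a

# Count matching bits: similar vectors should agree on many more.
same_ab = sum(x == y for x, y in zip(h(a), h(b)))
same_ac = sum(x == y for x, y in zip(h(a), h(c)))
```

The point of the question above: in 2D you could just use a grid or a k-d tree, but this hash works the same way whether `dim` is 2 or 2,000, and in high dimensions those simpler structures degrade while the hash does not.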
This is a really well-done article! Very approachable for a non-math-expert. If it is useful at all, I noticed two misspellings while reading: "moives" for "moves" and "tht" for "that". I am looking forward to more of your articles.
My original motivation for writing this was to explain (in a follow-up article) an ANN algorithm I published in 2010: https://epubs.siam.org/doi/pdf/10.1137/1.9781611973075.94
I never wrote that up (as a non-academic article), but the fact that people like this article is a nudge for me to do that.
The above paper was the first LSH algorithm which deterministically returns all close-enough points and guarantees no false positives for distant-enough points. I've heard there are also other algorithms with that property published since. If I have time, I would like to one day see how that algorithm compares against the others in Erik Bernhardsson's benchmarks.