
I'm not sure if anyone uses Base36 anymore (or its more obscure sister, Base32), but it uses [0-9, A-Z] as its alphabet. It is URL-safe, generally needs fewer characters than base 10 for the same number, and is the smallest standard URL-safe encoding that works with alphanumeric QR codes.
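
For anyone who wants to check the character-count claim, here is a minimal sketch. The helper name `to_base36` and the sample value are my own choices for illustration, not anything standard; single-digit numbers obviously come out the same length in both bases, so the saving only shows up on larger values.

```python
# Alphabet described above: digits followed by uppercase letters.
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base36(n: int) -> str:
    """Encode a non-negative integer using the [0-9, A-Z] alphabet."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if n == 0:
        return "0"
    out = []
    while n:
        n, remainder = divmod(n, 36)
        out.append(DIGITS[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(out))


n = 2**64 - 1  # arbitrary sample value (largest unsigned 64-bit integer)
encoded = to_base36(n)

# Compare how many characters each representation needs.
print("decimal length:", len(str(n)))
print("base36 length: ", len(encoded))

# Python's built-in int() can decode it again.
assert int(encoded, 36) == n
```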

I sort of assumed this was common knowledge, but I guess not.



I implicitly ignored encoding schemes like base 36 and 32 (and 16, referenced elsewhere in the thread) because they're not as good as the schemes referenced in the post. The best you can get that's fully URL safe with Alphanumeric is a hypothetical base 39, referenced in a footnote, and using only 39 of the 45 possible characters carries a 3.9% overhead (even ignoring the 50% overhead of the Base45 encoding from https://www.rfc-editor.org/rfc/rfc9285.html).

I've added an analysis of many more bases to the article: https://huonw.github.io/blog/2024/03/qr-base10-base64/#fn:ot...
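
The 3.9% figure can be reproduced with a quick calculation. This is only a sketch under one assumption: that "overhead" here means the information per character of the full 45-character alphabet divided by the information per character of the smaller alphabet, minus one. The function name `overhead_vs_45` is hypothetical, and the linked analysis may define things differently.

```python
import math

FULL_ALPHABET = 45  # characters available in QR Alphanumeric mode


def overhead_vs_45(base: int) -> float:
    """Extra characters needed when using only `base` of the 45 symbols.

    Each symbol of a `base`-sized alphabet carries log2(base) bits, so the
    ratio of the two logarithms is the ratio of message lengths.
    """
    return math.log2(FULL_ALPHABET) / math.log2(base) - 1


for base in (16, 32, 36, 39):
    print(f"base {base}: {overhead_vs_45(base):.1%} overhead")
```

Under that assumption, base 39 comes out at roughly 3.9%, and the smaller bases mentioned in the thread come out worse.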


I independently discovered base36 for a personal project recently and was very happy to have an explanation for why Python's base conversion goes up to 36.
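
A small sketch of what that limit looks like in practice: ten digits plus twenty-six letters gives 36 symbols, and the built-in `int()` refuses anything larger. The variable names are just for illustration.

```python
import string

# 10 digits + 26 letters = the 36 symbols int() can interpret.
symbols = string.digits + string.ascii_lowercase
print(len(symbols))

# Letters are accepted in either case.
largest_digit = int("z", 36)
same_digit_upper = int("Z", 36)
print(largest_digit, same_digit_upper)

# Bases above 36 are rejected with a ValueError.
rejected = False
try:
    int("10", 37)
except ValueError as exc:
    rejected = True
    print("base 37 rejected:", exc)
```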


Tooling is probably what dictates this more than anything. atob() is everywhere.


Yeah, I don't get it. Assume I have a standard URL with query params: the web browser doesn't understand the decimal encoding, right?

Let's assume... this: https://news.ycombinator.com/reply?id=39907672&goto=item%3Fi...

The special encoding is just about sending data to the backend?



