Stream_libarchive: workaround various types of locale braindeath (github.com/mpv-player)
26 points by ygra on Nov 12, 2017 | 5 comments


I can't believe it took this long for somebody to say it, to be honest. It's not exactly the best-kept secret that locale is one of the most horrifically broken things in modern operating systems.

Windows has its own problems. As pretty much any modern Windows coder will notice, all the APIs that take locale-dependent characters have an 'ANSI' and a 'UNICODE/Wide' version. As far as I know, this distinction was added with Windows NT and patched over with horrible macros (#define UNICODE, anyone?). When Windows NT was originally developed, 'UNICODE' was virtually synonymous with UCS-2, or at least Microsoft and Sun Microsystems seemed to think so.

ANSI variants actually accept plenty of multi-byte encodings, like Shift-JIS. And hey, there's even a codepage entry for UTF-8! ...and it doesn't work at all. Why has always been a mystery to me, because for a lot of English programs, a UTF-8 codepage would've magically made everything work with Unicode seamlessly. Initially I wondered if I was just missing something, but I made a small hook that did the translations automatically for a few APIs and yep, it worked. So Microsoft created this horrid system of 'non-Unicode' locales and 'wide' APIs, but figured UTF-8 was a waste of time, I guess.
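
To make the byte-level issue concrete, here is a minimal Python sketch. It does not call any Windows API. The helper names `as_ansi` and `utf8_to_wide` are made up for illustration, and cp1252 is just an assumed example of a legacy 'ANSI' code page. It shows the same UTF-8 bytes being misread under a legacy code page, and the UTF-8 to UTF-16 translation that a hook like the one described would presumably perform before calling a 'wide' API.

```python
# Hypothetical helpers; they only illustrate the encodings involved
# and do not touch any real Windows API.

def as_ansi(raw: bytes, codepage: str = "cp1252") -> str:
    """Decode bytes the way an 'ANSI' API would under a legacy code page."""
    return raw.decode(codepage, errors="replace")


def utf8_to_wide(raw: bytes) -> bytes:
    """Translate UTF-8 bytes to UTF-16LE, the encoding 'wide' APIs expect."""
    return raw.decode("utf-8").encode("utf-16-le")


name = "na\u00efve.txt"          # "naive.txt" with a diaeresis on the i
raw = name.encode("utf-8")       # what a UTF-8 program would pass along

mangled = as_ansi(raw)           # the two-byte UTF-8 sequence is read as two characters
wide = utf8_to_wide(raw)         # the translation a hook could do instead

print(ascii(mangled))                       # escaped, so it prints on any terminal
print(wide.decode("utf-16-le") == name)     # round-trips back to the original name
```

Pure ASCII names survive either path unchanged, which is why this kind of breakage tends to go unnoticed in English-only testing.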

You can, of course, change Windows locales per-thread, but that doesn't stop Microsoft from making it an absolute PITA to switch your so-called non-Unicode locale, which is global not just per app but across the whole system, and can only be changed by rebooting. I'll be honest, I'm not really sure why this is necessary at all, but it's exactly the same in Windows 10.

Even though it's considered 'legacy', it still affects all kinds of stuff you wouldn't expect. The most noticeable thing is that in the Japanese legacy locale, your backslashes turn into... yen! This is legacy behavior that goes all the way back to the first Japanese versions of MS-DOS.

Instead of fixing this in any way, like perhaps adding a per-program compatibility option, Microsoft released a program called AppLocale that awkwardly lets you set the locale per app. It's clunky.


I am so conflicted about the morality of this pull request, it's a work of art.

He insults a bunch of unknown people in a terrible way, but kindly points that out and then proceeds to tell you he'll now justify the statement.

And to sum it all up, it's attached to a commit that actually fixes the problem (it's not just a rant without a solution).

I've never been so conflicted about how I feel about something.


I'm a novice Linux user, and locales seem useless to me. Every time I install Ubuntu locally, I have to reset the system locale to en_US.UTF-8, because the default it picks (en_HK.UTF-8) crashes a lot of programs with "unsupported locale" errors.
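
For what it's worth, a program can guard against this instead of crashing. Below is a rough Python sketch of that idea, assuming the program is happy to run in the "C" locale when the requested one is missing. The function name `set_locale_or_fallback` and the bogus locale name are invented for the example.

```python
import locale


def set_locale_or_fallback(name: str, fallback: str = "C") -> str:
    """Try to activate `name`; use `fallback` if the C library rejects it."""
    try:
        return locale.setlocale(locale.LC_ALL, name)
    except locale.Error:
        # Python reports a rejected name as "unsupported locale setting".
        return locale.setlocale(locale.LC_ALL, fallback)


# Deliberately bogus name; on most systems this ends up in the fallback.
active = set_locale_or_fallback("xx_XX.UTF-8")
print(active)
```

Whether a given name is rejected depends on the C library and on which locales are installed, so the printed result can differ between machines.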


I have always been amused by the "C" locale. It always seemed like some standards writer punting on a hard problem.

As for the article, I agree 100%. The way Unix locales work always seemed to me like a horrible hack that did nothing but break programs. Granted, as an English speaker I was never the target audience, but even for the people they were supposed to help, they seemed to create more problems than they solved.
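
A small illustration of how the active locale changes behavior, sketched in Python under the assumption that a German locale named de_DE.UTF-8 may or may not be installed. The helper `decimal_point_in` is a made-up name. It asks the C library which decimal separator a locale uses, which is the kind of setting that silently changes how numbers get parsed and printed.

```python
import locale


def decimal_point_in(name: str):
    """Return the decimal separator for locale `name`, or None if unavailable."""
    saved = locale.setlocale(locale.LC_NUMERIC)   # query the current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, name)
        return locale.localeconv()["decimal_point"]
    except locale.Error:
        return None                               # locale not installed
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)


print(decimal_point_in("C"))             # the "C" locale uses "."
print(decimal_point_in("de_DE.UTF-8"))   # depends on what is installed
```

Where the German locale is installed, the second call typically returns a comma, so any code that formats or parses numbers through locale-aware C functions starts producing and expecting "1,5" instead of "1.5".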


So true. Can't be put any more eloquently.



