Clearly tongue-in-cheek, and informative. So nice somehow. But still: the fact t...

sho_hn · on Oct 18, 2014

> and informative.

There's a lot of half-knowledge and misdirection in this blog post, though.

* Languages aren't writing systems, and vice-versa.

* There are often decent reasons and upsides to the things he calls out as sucking. His rant on Hangul, for example - for human users, the morpho-syllabic grouping of letters into blocks in Korean writing tends to be beneficial, because it makes the orthography of morphemes in different contexts more regular and morpheme reuse smoother. I also tend to consider the way Hangul is implemented in Unicode actually quite clever (you can use a formula to map from the code points of the individual letters to the code points of the pre-composed blocks). (I've contrasted Chinese and Hangul a little in this LWN comment: http://lwn.net/Articles/608260/)

* The German ß isn't a simple ligature for 'ss'. It has a different history, an orthographic dimension and also serves as a useful pronunciation hint.

* Point 5, "The width of a character depends on context", is full of half-knowledge. First off, it's not "Jamo vowel" - 자모 (ja-mo) is the Korean word for "letter", so it's "vowel jamo" if anything. The alphabet is called 한글 (han-gul). Second, combining characters aren't in any way unique to Asian scripts in Unicode. Unicode contains separate code points for the individual Hangul letters and the blocks they can be combined into. Unicode also contains separate code points for Latin letters and the various diacritics and accents they combine with in Latin-derived Western scripts, along with code points for those combinations. For human users, this sort of consistent combinatorics in a script is a boon, and that Unicode chooses to represent both primitive elements and combined versions is a general trait of Unicode.
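
To make the two Unicode points in the list above concrete (the Hangul formula, and the fact that both primitive elements and combined versions get code points), here is a minimal Python sketch. The base values and counts are the ones used by the Unicode Standard's Hangul syllable composition arithmetic; the function name `compose_block` is made up for illustration, and the input is assumed to be conjoining jamo from the U+1100 block rather than the compatibility jamo a keyboard usually produces.

```python
import unicodedata

S_BASE = 0xAC00   # first precomposed block, 가
L_BASE = 0x1100   # first leading-consonant jamo
V_BASE = 0x1161   # first vowel jamo
T_BASE = 0x11A7   # one below the first trailing-consonant jamo
V_COUNT = 21      # number of vowel jamo
T_COUNT = 28      # 27 trailing consonants plus "no trailing consonant"


def compose_block(lead, vowel, trail=None):
    """Map conjoining jamo to the code point of the precomposed block."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = ord(trail) - T_BASE if trail else 0
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)


# The two blocks of 한글, built from their individual letters.
han = compose_block("\u1112", "\u1161", "\u11AB")
gul = compose_block("\u1100", "\u1173", "\u11AF")
print(han + gul)

# The standard library's NFC normalization performs the same composition.
print(unicodedata.normalize("NFC", "\u1112\u1161\u11AB") == han)

# Latin works the same way: "e" plus a combining acute accent composes
# to the single precomposed code point U+00E9.
print(unicodedata.normalize("NFC", "e\u0301") == "\u00e9")
```

The arithmetic works because the precomposed blocks are laid out in the code space in a regular leading/vowel/trailing order, so no lookup table is needed.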

In general, the thrust here is "your writing system sucks for me while doing my job", which is subject to this guy's performance and qualifications. It's not the same as "writing system sucks for its users". I find the specter of stuff like this catching on as memes among linguistically and culturally ignorant hackers (because hey, we write English, so we're in the doing-it-right camp, right?) frightening, because programmer convenience is only one factor among many, and most likely not the most important one.

jkrems · on Oct 18, 2014

I think the author makes it pretty obvious that he's only speaking about "your writing system sucks for me while doing my job". Sure, the title is hyperbolic - but with a title like "Your language sucks" nobody should expect it to be anything else. That's like saying the title "Dependency Injection will be the death of me" is flawed because it's highly unlikely the author will actually die because of it.

sho_hn · on Oct 18, 2014

Sure, but I think it's fair to be just as unforgiving in a critique then. My concern here is that we hackers love it when (seemingly) low-hanging fruit are pointed out and react with "well this sucks - let's drop it" very easily, and if this happens to someone as a result of reading this blog, it's bad. That said, I don't think eye-for-an-eye discourse is nice, either, and would certainly prefer it if this was a more rigorously fact-checked article.

bane · on Oct 18, 2014

I think your points are correct. However, it really is difficult and requires lots of specialized domain knowledge to properly support a big group of the world's languages. I'd almost go so far as to hire a specialist developer just to do international character handling if that was a requirement of the software.

sho_hn · on Oct 18, 2014

It is difficult and does require some learning, yeah. In fact, it's more difficult than the article hints at, because while it falsely speaks of languages it's actually only concerned with writing systems, and a lot of the more interesting localization problems happen on the actual language/grammar level (for example, plain gettext isn't good enough to handle Korean post-positions, so we try to do better[1] in KDE).
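
A rough sketch of why a fixed translated string falls short for post-positions: the form of a particle depends on whether the word before it ends in a consonant, and that word is often only known at runtime. The Python below picks between the topic particles 은 and 는 using the code point layout of precomposed Hangul blocks. It is only an illustration under those assumptions, not how KDE's Transcript system works; the function names are hypothetical, and real handling has more cases (other particles, words not ending in Hangul).

```python
HANGUL_FIRST = 0xAC00   # first precomposed block, 가
HANGUL_LAST = 0xD7A3    # last precomposed block
T_COUNT = 28            # trailing-consonant slots per leading/vowel pair


def ends_in_consonant(word):
    """True if the last Hangul block of `word` has a trailing consonant."""
    last = ord(word[-1])
    if not HANGUL_FIRST <= last <= HANGUL_LAST:
        raise ValueError("last character is not a precomposed Hangul block")
    # Slot 0 within each group of 28 means "no trailing consonant".
    return (last - HANGUL_FIRST) % T_COUNT != 0


def with_topic_particle(noun):
    """Append 은 after a consonant-final noun, 는 after a vowel-final one."""
    return noun + ("은" if ends_in_consonant(noun) else "는")


print(with_topic_particle("한글"))  # 글 ends in a consonant
print(with_topic_particle("자모"))  # 모 ends in a vowel
```

A message template like "%s is ..." cannot encode this choice statically, which is the kind of gap scripting-capable translation systems try to close.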

But we can build, and have built, reusable tools to shoulder much of the burden, and because these problems are so fundamental and culture-enabling, it can actually be a really rewarding and satisfying area to work in. In fact, I bet that if you caught the author on a better day, he would agree.

1 = https://techbase.kde.org/Localization/Concepts/Transcript#Ko...

Argorak · on Oct 18, 2014

Yes. You probably should. But there is no point in ranting; the same is true if you want a good graphics system.

clairity · on Oct 18, 2014

i think the blocks-as-syllables concept in hangul makes lots of sense. that, and the straightforward mapping of letters to the english alphabet, made it fairly easy to learn how to read korean. understanding what you're reading is a completely different matter however.

sho_hn · on Oct 18, 2014

The funky part is that the blocks aren't just syllables: the boundaries between blocks often correspond to the boundaries between morphemes, because of how the phonotactics of the language have evolved and Hangul was designed to fit them. To use English as an example, the word "cats" contains two morphemes, the stem "cat" and the plural suffix "s". In Korean+Hangul, this sort of combination tends to split neatly into blocks, and when there are multiple options for how the letters can be distributed over them, the orthography usually prefers the variant that keeps the morpheme spelled consistently everywhere. This gives the writing a neat sort of Lego-blocks feel.
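
A small sketch of that consistency, assuming Python's standard `unicodedata` module and an example picked purely for illustration (the verb stem 먹, "eat", with three common endings). NFD normalization splits each block into its individual letters, which lets us check that the stem's letters stay in one block across forms, while a purely sound-based spelling of the first form would move the final consonant into the next block.

```python
import unicodedata


def letters(text):
    """Split Hangul blocks into their individual conjoining jamo."""
    return unicodedata.normalize("NFD", text)


STEM = "먹"
FORMS = ["먹어", "먹고", "먹는"]

stem_letters = letters(STEM)

# Every standard spelling starts with exactly the stem's three letters,
# including its trailing consonant.
for form in FORMS:
    print(form, letters(form).startswith(stem_letters))

# 먹어 is pronounced roughly like 머거. Spelled that way, the consonant
# becomes the leading letter of the second block and the stem is no
# longer visible as a unit.
phonetic = "머거"
print(phonetic, letters(phonetic).startswith(stem_letters))
```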

There are plenty of other nifty traits to Hangul, such as the way many letters derive from each other graphically and thus lend themselves to two-step entry and reduced-number-of-keys inputs like keypads. And the design of the letters is featural, not arbitrary; that is, they are often visualizations of tongue position or mouth shape when forming the respective sound.

The theme here is: There are many other concerns when it comes to the quality of a writing system. What the author is concerned with here isn't entirely unimportant, but it's a very, very narrow view.

clairity · on Oct 18, 2014

interesting. i knew about the letter shapes resembling tongue/mouth positions but not that blocks represented morphemes, not just syllables.

sho_hn · on Oct 18, 2014

This is partly because while Korean and Chinese are distinct languages, a large part of the Korean vocabulary consists of imported Chinese stems, and Chinese is heavily monosyllabic. As such, a lot of Hangul blocks are close homophones to the sound values of Han characters.

bane · on Oct 18, 2014

In fact, English has had several great moments in history where the orthography could have been reformed.

One of my favorites is

https://en.wikipedia.org/wiki/Benjamin_Franklin's_phonetic_a...

marcosdumay · on Oct 18, 2014

English-speaking people got lucky in that phonology had almost no importance in the development of computing.