
ICU is very good, but even it isn't perfect, and they know it. http://userguide.icu-project.org/collation/customization:

"ICU provides a data-driven, flexible, and run-time-customizable mechanism called 'tailoring'. Tailoring overrides the default order of code points and the values of the ICU Collation Service attributes."
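
To make the idea concrete, here is a rough pure-Python sketch. It does not use ICU or its rule syntax at all; `make_key` and the second alphabet are made up for illustration. It only shows how overriding the default position of one character changes the result:

```python
def make_key(alphabet):
    """Build a sort key from an explicit character order (a toy stand-in for tailoring)."""
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def key(word):
        # Characters outside the alphabet sort after it, by code point.
        return [rank.get(ch, len(alphabet) + ord(ch)) for ch in word.lower()]

    return key


BASE = "abcdefghijklmnopqrstuvwxyz"

# Swedish treats å, ä, ö as separate letters at the end of the alphabet.
swedish_key = make_key(BASE + "åäö")

# Hypothetical override: place ä directly after a instead.
a_umlaut_after_a_key = make_key("aä" + BASE[1:])

words = ["zebra", "äpple", "apa", "ask"]
swedish_order = sorted(words, key=swedish_key)
tailored_order = sorted(words, key=a_umlaut_after_a_key)

print(swedish_order)
print(tailored_order)
```

Real tailoring is far richer than this (multiple comparison levels, contractions, attributes), so treat it as a picture of the concept, not of the API.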

Also, you sometimes need context to properly sort strings. Examples:

Are you sorting phone book entries or items in a dictionary? In some languages, that does make a difference.

Are you sorting Swiss German or German German?

Given two 'obviously' Italian words, should you apply Italian collation rules? You probably shouldn't when both are entries in an English-language dictionary.

Some library catalogs want(ed?) to sort "2001: a Space Odyssey" as "Two thousand and…" (http://en.m.wikipedia.org/wiki/Library_catalog#Sorting)

For the last of these, even ICU's tailoring won't help you: the sort position depends on how the title is read, not on the characters in it.
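
A small Python sketch of two of the cases above, under stated assumptions. The German part follows the two DIN 5007 variants in simplified form (dictionaries treat ä/ö/ü like a/o/u, phone books like ae/oe/ue) and ignores tie-breaking rules. The catalog part assumes a hand-maintained table of filing forms; `FILING_FORM`, `catalog_key`, and the spelled-out title in it are hypothetical, not taken from any real catalog:

```python
# DIN 5007 variant 1 (dictionaries): ä, ö, ü sort like a, o, u.
DICTIONARY = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
# DIN 5007 variant 2 (phone books, name lists): ä, ö, ü sort like ae, oe, ue.
PHONEBOOK = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def german_key(word, table):
    """Simplified key: lowercase, then expand umlauts according to the chosen variant."""
    return word.lower().translate(table)


names = ["Muller", "Mueller", "Müller", "Mufti"]
as_dictionary = sorted(names, key=lambda w: german_key(w, DICTIONARY))
as_phonebook = sorted(names, key=lambda w: german_key(w, PHONEBOOK))

# Hypothetical catalog data: a filing form someone entered by hand.
# The spelled-out wording is an assumption made for this example.
FILING_FORM = {"2001: a Space Odyssey": "Two thousand and one: a space odyssey"}


def catalog_key(title):
    """Sort by the hand-entered filing form when one exists, else by the title itself."""
    return FILING_FORM.get(title, title).lower()


titles = ["Ulysses", "2001: a Space Odyssey", "Treasure Island"]
shelf_order = sorted(titles, key=catalog_key)

print(as_dictionary)
print(as_phonebook)
print(shelf_order)
```

The same four names come out in two different orders depending on which variant the caller picks, and the title only lands between "Treasure Island" and "Ulysses" because a person supplied its filing form. No collation rule can derive either choice from the strings alone.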


