
As for why the trailing dot is disallowed, see <http://saynt2day.blogspot.com/2013/03/danger-of-trailing-dot....

The goal was to come up with a good regular expression to validate URLs as user input, and not to match any URL that browsers can handle (as per the URL Standard).

a.b--c.de is supposed to fail because `--` can only occur in Punycoded domain name labels, and those can only start with `xn--` (not `b--`).
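For illustration, here is a rough Python sketch of just the two rules mentioned in this comment (no trailing dot, and `--` only inside `xn--` labels). The function name `hostname_passes` is made up for this example, and the check is a simplification of the rule as stated here, not the actual regex and not a full IDNA validator.

```python
def hostname_passes(hostname: str) -> bool:
    """Apply only the two hostname rules discussed in the comment above.

    Hypothetical helper for illustration; real hostname validation
    has more rules (label length, allowed characters, and so on).
    """
    labels = hostname.lower().split(".")

    # A trailing dot (or a doubled dot) produces an empty label.
    if any(label == "" for label in labels):
        return False

    # Rule as stated in the comment: "--" may only appear in
    # Punycoded labels, and those start with "xn--".
    for label in labels:
        if "--" in label and not label.startswith("xn--"):
            return False

    return True


print(hostname_passes("a.b--c.de"))              # rejected: "b--c" is not an xn-- label
print(hostname_passes("xn--bcher-kva.example"))  # accepted: Punycoded label
print(hostname_passes("example.com."))           # rejected: trailing dot
```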



Many of these regexes should not be used on user input, at least not if your regex library uses a backtracking matcher (the kind usually labelled "NFA-based"), because of the risk of ReDoS: http://en.wikipedia.org/wiki/ReDoS
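To make the risk concrete, here is a toy model in Python of how a naive backtracking matcher would handle the textbook pathological pattern `(a+)+$`. That pattern is an assumption picked for illustration; it is not one of the URL regexes under discussion, and `count_steps` is a made-up helper, not how any particular regex library is implemented (real engines add optimizations). The point is only that the work doubles with each extra `a` when the input almost matches.

```python
def count_steps(text: str) -> tuple[bool, int]:
    """Simulate naive backtracking for the pattern (a+)+$ on `text`.

    Returns (matched, steps), where steps counts how many times the
    matcher tries to start a new repetition of the group (a+).
    """
    steps = 0
    n = len(text)

    def try_group(pos: int) -> bool:
        nonlocal steps
        steps += 1
        # Greedy a+: find how far the run of 'a' extends from pos.
        end = pos
        while end < n and text[end] == "a":
            end += 1
        # Backtrack: try the longest run first, then shorter ones.
        for stop in range(end, pos, -1):
            if stop == n:           # '$' matches at end of input
                return True
            if try_group(stop):     # otherwise try another (a+)
                return True
        return False

    matched = try_group(0)
    return matched, steps


for length in (5, 10, 15):
    # Input that almost matches: a run of 'a' followed by one 'b'.
    print(length, count_steps("a" * length + "b"))
```

In this model an input of n `a` characters followed by `b` costs 2^n steps before failing, while the matching input (no trailing `b`) succeeds in a single step.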

Trying to shoehorn NFAs into parsing stuff that isn't a regular language is generally a bad idea. (See: Langsec.)



