> When you open a listening socket on a host and request port 0 from the operating system, this call is typically interpreted as a request to return a single, currently unused port to the application

But... why? Surely some kind of separate BIND_UNUSED_PORT operation would have been fine, without needlessly overloading the meaning of "port 0".



Having a specific port number that means "pick a random unused port" is very useful when you run any application that opens a listening TCP socket and takes the port number from its config. More often than not, the default behaviour (when the port is not specified) is to use some hardcoded port number like 5555. But what if you want to run two/three/many copies of this app, dynamically, without them interfering with each other? One possibility is manual free-port detection, but that's inherently racy. Another is to specify port 0 and get the actual port number from the application through some side channel (say, by reading it from stderr or a log file).
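
For the side-channel approach, a minimal sketch of the usual pattern (assuming a plain blocking TCP server on the loopback address): bind to port 0, then ask the kernel which port it actually picked with getsockname() and report it.

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int s = socket(AF_INET, SOCK_STREAM, 0);
        if (s < 0) { perror("socket"); return 1; }

        struct sockaddr_in addr = { 0 };
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = htons(0);   /* port 0 = "give me any unused port" */

        if (bind(s, (struct sockaddr *)&addr, sizeof addr) < 0) {
            perror("bind");
            return 1;
        }

        /* Recover the port the kernel actually assigned. */
        socklen_t len = sizeof addr;
        getsockname(s, (struct sockaddr *)&addr, &len);
        fprintf(stderr, "listening on port %u\n", ntohs(addr.sin_port));

        listen(s, 16);
        /* ... accept() connections as usual ... */
        close(s);
        return 0;
    }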

Now, I assure you that if this "some kind of separate BIND_UNUSED_PORT" operation had been available in the Berkeley socket API from the start, it would be mostly unused: people would have just used the normal bind() instead (it's way simpler), so the latter option simply becomes impossible; you'd have to manually pick free ports for temporary servers. In fact, some existing applications actively refuse to listen on port zero. For example, "ssh -D 0" doesn't work for some reason (patching out the zero check breaks nothing, everything keeps working properly), and such programs are somewhat annoying to use transiently.


I would have just made sin_port a 32-bit number, with UNUSED_PORT being some value in the otherwise invalid range, and the other out-of-range values giving you EINVAL on bind(2).
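
To make that concrete, a purely hypothetical sketch; none of these names exist in the real sockets API, and sockaddr_in32/sin32_port/UNUSED_PORT are made up for illustration:

    /* Hypothetical: a sockaddr variant with a 32-bit port field.
       UNUSED_PORT sits just outside the valid 16-bit range, so it can
       never collide with a real port; any other value above 65535 would
       make bind() fail with EINVAL. */
    #include <stdint.h>

    #define UNUSED_PORT 0x10000u   /* first value outside 0..65535 */

    struct sockaddr_in32 {
        uint16_t sin_family;
        uint32_t sin32_port;       /* 0..65535 = bind exactly this port,
                                      UNUSED_PORT = pick any free port */
        uint32_t sin_addr;
        /* ... */
    };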


Or you could save those 2 bytes at the cost of being unable to listen on one out of 65536 ports. In 1983, that tradeoff absolutely made sense: 64K ports should be enough for everyone.


Even in 1983, the kinds of machines expected to be networked with IP could spare the occasional couple of bytes here and there if it meant being able to signal out-of-band information to the stack without carving out part of the wire format as unencodable.


I guess, but what if it was just a call to bind_unused() instead of bind()?


Most programs would simply use bind(), instead of having more complicated option parsing and a condition around bind()/bind_unused():

    struct sockaddr_in addr = { 0 };
    addr.sin_family = AF_INET;
    addr.sin_port = htons(HARDCODED_PORT);
    // other fields defaulted...

    // option parsing and filling the sockaddr structure...

    if (has_port_number && flag_use_random_port) {
        error(...);
        exit(1);
    }
    if (has_port_number) {
        bind(s, (struct sockaddr *)&addr, sizeof addr);
    } else {
        bind_unused_port(s, (struct sockaddr *)&addr, sizeof addr);
    }
vs.

    struct sockaddr_in addr = { 0 };
    addr.sin_family = AF_INET;
    addr.sin_port = htons(HARDCODED_PORT);
    // other fields defaulted...

    // option parsing and filling the sockaddr structure...

    bind(s, (struct sockaddr *)&addr, sizeof addr);
Yeah, people would just write the second version. And mind you, the sockaddr argument for bind_unused() would still need a local address to bind to, so what exactly is gained? The difference between 65535 and 65536 usable ports doesn't warrant complicating the API in such a manner, IMO.


They could have done something like that, but then you'd need to pass more than a 16-bit value for the port number. Back when that was a concern, limiting to only 65535 usable ports wasn't a big deal. Now, you probably could use a larger integer and pass a flag to really bind to port zero, except that there's probably a lot of network equipment that will drop that traffic. If you also run on an IPv4 address with a final octet of 0 or 255, you'll have a really hard-to-reach service.



