getaddrinfo cross-platform edge case behavior

The POSIX and Windows getaddrinfo function returns a list of IP addresses and port numbers for a given hostname and service (i.e. port), superseding gethostbyname and getservbyname. Besides some flags, it accepts two string parameters. Either one of them may be null: a null host stands for localhost (or rather 0.0.0.0, depending on AI_PASSIVE), and a null service for an automatically assigned port. Passing null for both at the same time, however, is forbidden and leads to an EAI_NONAME error (WSAHOST_NOT_FOUND on Windows). What happens if the strings are empty (consisting only of the terminating "\0") instead of null is not covered by the spec and not really documented anywhere.
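To make the parameter combinations concrete, here is a minimal C sketch, assuming a POSIX system (on Windows the Winsock headers and WSAStartup would be needed instead). The helper name lookup_status is made up for illustration; it only reports the status code of a single getaddrinfo call.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper (not from the post): returns getaddrinfo()'s status
 * code for one host/service/flags combination and discards the results. */
static int lookup_status(const char *host, const char *service, int flags)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;  /* no address family restriction */
    hints.ai_flags = flags;       /* e.g. 0 or AI_PASSIVE */

    int rc = getaddrinfo(host, service, &hints, &res);
    if (rc == 0)
        freeaddrinfo(res);        /* only the status is of interest here */
    return rc;
}
```

Calling lookup_status(NULL, NULL, 0) should yield EAI_NONAME, while lookup_status(NULL, "80", 0) should succeed; gai_strerror() turns a non-zero status into a readable message. The empty-string cases are exactly where the result depends on the platform.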

It turns out that the various operating systems differ quite a bit here, which is likely to cause issues for Wine (an implementation of the Windows API on POSIX/X systems). To get a clear understanding of how the different cases are handled, I put together a little D program which tests a few combinations of host name, port, and flag parameters (see end of post). The snippet could be written in C just the same, as getAddressInfo maps directly to getaddrinfo; I just chose D to avoid platform dependencies and an unduly large amount of boilerplate code.

The results are summarized in the following table, where »loopback« means that the IP addresses returned were 127.0.0.1 and ::1, »catchall« refers to 0.0.0.0 and ::, »public« means that the actual IP addresses of all available network interfaces were returned, and NONAME refers to a lookup error. A host of »hostname« means that the actual fully qualified name of the host that ran the test was used (note that the host part of the FQDN alone usually does not resolve on OS X).

Host         Port   Flags       Windows   Linux          OS X
null         null   -           NONAME    NONAME         NONAME
null         null   AI_PASSIVE  NONAME    NONAME         NONAME
null         ""     -           loopback  loopback       NONAME
null         ""     AI_PASSIVE  catchall  catchall       NONAME
null         "0"    -           loopback  loopback       loopback
null         "0"    AI_PASSIVE  catchall  catchall       catchall
null         "80"   -           loopback  loopback       loopback
null         "80"   AI_PASSIVE  catchall  catchall       catchall
""           null   -           public    NONAME         NONAME
""           null   AI_PASSIVE  public    NONAME         NONAME
""           ""     -           public    NONAME         NONAME
""           ""     AI_PASSIVE  public    NONAME         NONAME
""           "0"    -           public    NONAME         loopback
""           "0"    AI_PASSIVE  public    NONAME         catchall
""           "80"   -           public    NONAME         loopback
""           "80"   AI_PASSIVE  public    NONAME         catchall
"localhost"  null   -           loopback  loopback (v4)  loopback
"localhost"  null   AI_PASSIVE  loopback  loopback (v4)  loopback
"localhost"  ""     -           loopback  loopback (v4)  loopback
"localhost"  ""     AI_PASSIVE  loopback  loopback (v4)  loopback
"localhost"  "0"    -           loopback  loopback (v4)  loopback
"localhost"  "0"    AI_PASSIVE  loopback  loopback (v4)  loopback
"localhost"  "80"   -           loopback  loopback (v4)  loopback
"localhost"  "80"   AI_PASSIVE  loopback  loopback (v4)  loopback
hostname     null   -           public    loopback (v4)  public
hostname     null   AI_PASSIVE  public    loopback (v4)  public
hostname     ""     -           public    loopback (v4)  public
hostname     ""     AI_PASSIVE  public    loopback (v4)  public
hostname     "0"    -           public    loopback (v4)  public
hostname     "0"    AI_PASSIVE  public    loopback (v4)  public
hostname     "80"   -           public    loopback (v4)  public
hostname     "80"   AI_PASSIVE  public    loopback (v4)  public
getaddrinfo() behavior on Windows Server 2008 R2, Arch Linux (Kernel 3.1.4, glibc 2.14.1), and OS X 10.7.2 (Lion).
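The result categories used in the table could be told apart programmatically along these lines. This is only a sketch assuming POSIX headers; the enum and the classify function are invented for illustration, and treating all of 127.0.0.0/8 as loopback (rather than just 127.0.0.1) is my own assumption.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical names mirroring the table's result categories. */
enum addr_kind { ADDR_LOOPBACK, ADDR_CATCHALL, ADDR_PUBLIC, ADDR_UNKNOWN };

static enum addr_kind classify(const struct sockaddr *sa)
{
    if (sa->sa_family == AF_INET) {
        const struct sockaddr_in *in4 = (const struct sockaddr_in *)sa;
        uint32_t a = ntohl(in4->sin_addr.s_addr);
        if (a == INADDR_ANY)
            return ADDR_CATCHALL;            /* 0.0.0.0 */
        if ((a >> 24) == 127)
            return ADDR_LOOPBACK;            /* assumption: all of 127.0.0.0/8 */
        return ADDR_PUBLIC;                  /* any other interface address */
    }
    if (sa->sa_family == AF_INET6) {
        const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)sa;
        if (IN6_IS_ADDR_UNSPECIFIED(&in6->sin6_addr))
            return ADDR_CATCHALL;            /* :: */
        if (IN6_IS_ADDR_LOOPBACK(&in6->sin6_addr))
            return ADDR_LOOPBACK;            /* ::1 */
        return ADDR_PUBLIC;
    }
    return ADDR_UNKNOWN;                     /* not an IP address family */
}
```

Each ai_addr in a getaddrinfo result list can be passed to classify directly. Note that »public« here just means "anything else", so private ranges such as 192.168.0.0/16 land in that bucket too.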

What caused me to investigate the issue in the first place is the behavior when given an empty, non-null host string: Windows returns the public addresses of the present interfaces, while OS X resolves it to ::1/::, but only if a port is given, and Linux doesn’t resolve it at all! Windows generally accepts the most combinations, returning an error only for the explicitly disallowed one, and some applications (e.g. the game League of Legends) rely on this leniency.

There were also some less significant differences in behavior, most of which are not listed in the table. First, in both of the Linux VMs I tried (an up-to-date Arch box and Ubuntu Oneiric), only the IPv4 address of the loopback interface was returned. Second, as no address family, socket type or protocol hints were passed to getaddrinfo() in the test, each address was returned twice on OS X, once with SOCK_STREAM/IPPROTO_TCP and once with SOCK_DGRAM/IPPROTO_UDP set. Linux returned three copies of each address, for STREAM, DGRAM and RAW, with the corresponding protocol types set, whereas Windows only returned a single copy with protocol type IPPROTO_IP and socket type set to 0.
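The duplicates go away once the caller states what kind of socket it wants. A small C sketch of this, again assuming POSIX headers (count_results is a made-up helper name, not part of any API):

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper: counts the entries getaddrinfo() returns for the
 * given family/socket type hints, or returns -1 if the lookup fails. */
static int count_results(const char *host, const char *service,
                         int family, int socktype)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int n = 0;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = family;      /* AF_UNSPEC, AF_INET or AF_INET6 */
    hints.ai_socktype = socktype;  /* 0 means "any socket type" */

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;
    for (const struct addrinfo *p = res; p != NULL; p = p->ai_next)
        ++n;
    freeaddrinfo(res);
    return n;
}
```

With AF_INET and SOCK_STREAM, a lookup of a null host and port "80" should return a single entry; with a socket type of 0 the count depends on the platform, as described above.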

In any case, as a result I have prepared a patch for Wine to emulate at least the succeeding/failing behavior of the Winsock incarnation of getaddrinfo on Linux and OS X, which should solve the bigger part of the related problems. There ideally shouldn’t be any Windows software relying on details beyond that (such as the actual number/layout of addresses returned), but who knows…
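To illustrate the general idea (this is not the actual Wine patch, just a sketch of one way to get the same succeed/fail pattern as the Windows column of the table), the empty strings can be rewritten before calling the native getaddrinfo. All function names below are hypothetical, and the approach assumes that the machine's own host name resolves locally.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* An empty service becomes "0", which every tested platform accepts. */
static const char *normalize_service(const char *service)
{
    return (service != NULL && service[0] == '\0') ? "0" : service;
}

/* An empty host becomes the machine's own host name (stored in buf). */
static const char *normalize_host(const char *host, char *buf, size_t len)
{
    if (host == NULL || host[0] != '\0')
        return host;                   /* null or non-empty: leave as is */
    if (len == 0 || gethostname(buf, len) != 0)
        return host;                   /* fall back to the empty string */
    buf[len - 1] = '\0';               /* may be unterminated if truncated */
    return buf;
}

/* Sketch of a Windows-like front end to the native getaddrinfo(). */
static int winlike_getaddrinfo(const char *host, const char *service,
                               const struct addrinfo *hints,
                               struct addrinfo **res)
{
    char name[256];
    return getaddrinfo(normalize_host(host, name, sizeof name),
                       normalize_service(service), hints, res);
}
```

Both parameters being null still fails with EAI_NONAME, since nothing is rewritten in that case. Which addresses come back for an empty host still differs from Windows (e.g. the loopback address on the Linux boxes above), so only the success/failure outcome is emulated.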

import std.algorithm, std.conv, std.range, std.socket, std.stdio;
alias AIF = AddressInfoFlags;
void main() {
  // Try every combination of host, port and flags.
  foreach (host; [null, "", "localhost", Socket.hostName()])
  foreach (port; [null, "", "0", "80"])
  foreach (flags; [cast(AIF)0, AIF.PASSIVE]) {
    // Print the parameters, distinguishing null from the empty string.
    write(
      host is null ? "null" : "'" ~ host ~ "'", ":",
      port is null ? "null" : "'" ~ port ~ "'", " (", flags, "): "
    );
    try {
      // List all returned addresses with their protocol, sorted by family.
      writeln(
        join(
          map!q{text(a.address, " (", a.protocol, ")")}(
            sort!q{a.family < b.family}(
              getAddressInfo(host, port, flags)
            )
          ),
          ", "
        )
      );
    } catch (Exception e) {
      // A failed lookup (e.g. NONAME) is reported as an exception.
      writefln("[%s]", e.msg);
    }
  }
}
D program used for gathering the data (longer than necessary to get semi-nice output).