Inflexible Storage

Another class of problems arises when storing information needed by various protocols. This was already mentioned in the discussion of Figure 1, where the use of sockaddr_in not only hard-codes the address format of a particular protocol but also fails to provide enough space to store the addresses of many protocols. This problem most commonly appears in calls to getpeername(). Figure 3 shows an example from the 4.4BSD-Lite2 fingerd source; similar code sequences can be found in almost any server program. The example also assumes that the returned socket will be an IP socket; while originally a fair assumption, this must be fixed for the program to be protocol-independent.

Figure 3: Use of a sockaddr_in to Store Arbitrary Addresses (4.4BSD fingerd)

        struct sockaddr_in sin;
        ...
        if (logging) {
                sval = sizeof(sin);
                if (getpeername(0, (struct sockaddr *)&sin, &sval) < 0)
                        err("getpeername: %s", strerror(errno));
                if (hp = gethostbyaddr((char *)&sin.sin_addr.s_addr,
                    sizeof(sin.sin_addr.s_addr), AF_INET))
                        lp = hp->h_name;
                else
                        lp = inet_ntoa(sin.sin_addr);
                syslog(LOG_NOTICE, "query from %s", lp);
        }

A similar problem is also seen frequently in servers: the use of a generic sockaddr to store address information. Like the IP-specific structure, it is not big enough to hold addresses for many protocols (on most systems, the two structures are actually the same size). When the size of the address to be stored is known, a buffer of that size can be allocated. When it is not, a maximal-length buffer can be allocated using a sockaddr_storage, which will be discussed later.

A particularly bad special case of this problem comes about in some IP-only programs. Because IP addresses happen to be 32-bit unsigned integers and many modern systems have that as a native data type, some programs simply use integers to store IP addresses. Figure 4 shows an example from vat 4.0b2[5], which uses u_int32_t values internally to store network addresses (this is a bit less bad than using more generic integer types, but still hopelessly IP-dependent). Due to a particularly common example of this in earlier versions of BSD, this is sometimes referred to as the ``all the world's a u_long'' problem, and has a lot in common with the old ``all the world's a VAX'' problem. Optimizing assumptions are being made about the size and form of an address that happen to work on most currently interesting systems and protocols. But they are still poor assumptions that break portability, both across different systems and across different protocols. 4.4BSD-Lite2 has fixed this problem in many places by using in_addr instead, which is still protocol-dependent but at least is the correct type. In general, raw addresses should not be stored; socket addresses should be used instead.

Also, some protocols have variable-length addresses. Most existing programs treat addresses as fixed-length objects and do not store the real length as provided by run-time functions. Programs must store the length of each address along with the address itself; as with the address type, the length can be necessary information for interpreting the address. This also means that the sizes of buffers used to hold addresses should not be arbitrarily bounded.
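One possible way to keep the length with the address is sketched below. The saved_addr structure and its two functions are hypothetical names used only for illustration. The copy is allocated on the heap at exactly the length reported at run time, so the code imposes no fixed upper bound on address size.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical container: a socket address of any family plus its real length. */
struct saved_addr {
    socklen_t        len;  /* length as reported by accept(), getpeername(), ... */
    struct sockaddr *sa;   /* heap copy of exactly len bytes */
};

/* Copy len bytes of src into a newly allocated saved_addr; NULL on failure. */
struct saved_addr *saved_addr_new(const struct sockaddr *src, socklen_t len)
{
    struct saved_addr *a;

    if (src == NULL || len == 0)
        return NULL;
    if ((a = malloc(sizeof(*a))) == NULL)
        return NULL;
    if ((a->sa = malloc(len)) == NULL) {  /* malloc memory suits any alignment */
        free(a);
        return NULL;
    }
    memcpy(a->sa, src, len);
    a->len = len;
    return a;
}

/* Release a saved_addr created by saved_addr_new(); NULL is accepted. */
void saved_addr_free(struct saved_addr *a)
{
    if (a != NULL) {
        free(a->sa);
        free(a);
    }
}
```

A caller would typically pass the buffer and length returned by a call such as getpeername(), and later hand a->sa and a->len back to functions such as connect() or sendto().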

Using the generic sockaddr or the wrong protocol-specific structure also creates problems with alignment. Most network protocols have some alignment requirement for their protocol-specific address structures that may not be satisfied by other structures. Care must be taken either to use the correct protocol-specific address structure or to arrange for the buffer used to store addresses to be properly aligned.

The generic sockaddr should be used as a structure to which an arbitrary socket address can be cast in order to access the sa_family and sa_len fields. While those fields should have the same type no matter which protocol-specific structure is used to access the buffer, it is still good use of types to use the generic sockaddr for access when the network protocol in use is not yet known, rather than using the wrong protocol-specific type.
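A minimal sketch of this use of the generic sockaddr follows. The function name addr_family() is hypothetical. The code reads only sa_family, because sa_len exists on 4.4BSD-derived systems but not on every sockets implementation.

```c
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: report the address family of a socket address whose
 * protocol is not yet known. buf must be suitably aligned for a struct
 * sockaddr. Returns the family, or -1 if the buffer is too short to
 * contain the generic header fields.
 */
int addr_family(const void *buf, socklen_t len)
{
    /* View the buffer through the generic type, not a protocol-specific one. */
    const struct sockaddr *sa = buf;

    /* Everything before sa_data is the protocol-independent header. */
    if (buf == NULL || len < offsetof(struct sockaddr, sa_data))
        return -1;
    return sa->sa_family;
}
```

Once the family is known, the caller can switch on it and only then cast the buffer to the matching protocol-specific structure.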

Finally, many programs assume that a ``port'' is an integer. The concept of an integer port number is not universal. Some protocols use string service names instead, or use other formats that are at least convertible to a string. Service endpoints should be represented as strings that may or may not end up converted to another format for representation in a socket address.
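On systems that provide getaddrinfo(), a client can keep the service as a string all the way to the socket address, roughly as sketched below. The function connect_to() is a hypothetical helper, and the choice of SOCK_STREAM is an assumption made for the example.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Hypothetical helper: connect to a host and service that are both given
 * as strings (for example "finger", or "79" where numeric ports apply).
 * Returns a connected descriptor, or -1 on failure.
 */
int connect_to(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* accept any address family */
    hints.ai_socktype = SOCK_STREAM;  /* assumption: a stream service */

    /* The resolver, not the program, turns the service string into
     * whatever form the protocol's socket address needs. */
    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    /* Try each returned address until one connects. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                    /* success: keep this descriptor */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

The program never parses the service into an integer itself, and it uses the address length supplied in ai_addrlen without assuming a fixed size.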