
Hard Coding

The most obvious protocol dependence problem seen in network programs is hard-coding the use of one protocol. Figure 1 shows an example from the telnet program in 4.4BSD-Lite2[3]. There are three major problems here. First, the protocol family is hard-coded as AF_INET, which prevents any protocol other than IP from being used. The family needs to be chosen based on the name resolution information, as will be discussed later. Second, the socket address is stored in a protocol-dependent structure, in this case sockaddr_in. This structure is not big enough to hold addresses for some protocols, and in any case manipulating its fields directly is a protocol-dependent activity. Sockaddrs need to be treated as opaque buffers manipulated by protocol-independent library functions or by carefully guarded code. Third, IP-specific socket options are used without any guards. Even if the first two problems were fixed, the IP-specific setsockopt() calls would still be made, and on a non-IP socket they should always fail. Depending on the particular option being set, the call needs either to be replaced with an abstract equivalent or to be surrounded by a guard that skips it when the protocol in use is not IP.
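
The guard described for the third problem might look roughly like the following. This is only a sketch: set_source_route() and its parameters are hypothetical names invented for illustration and do not come from the telnet source. It assumes the caller already knows the address family the socket was created with.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper (not from the telnet source): apply IP source-route
 * options only when the socket really is an IPv4 socket.
 * Returns 0 if the option was applied or deliberately skipped,
 * -1 if setsockopt() itself failed.
 */
int set_source_route(int fd, int family, const void *opts, socklen_t optlen)
{
    assert(optlen == 0 || opts != NULL);

    if (opts == NULL || optlen == 0)
        return 0;                       /* nothing was requested */

#if defined(IP_OPTIONS) && defined(IPPROTO_IP)
    if (family == AF_INET)              /* guard: IP-specific option */
        return setsockopt(fd, IPPROTO_IP, IP_OPTIONS, opts, optlen);
#endif

    (void)fd;
    (void)family;
    return 0;                           /* not IP: skip the call */
}
```

Whether a skipped option should be reported to the user is a policy decision left to the caller; the sketch skips it silently.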

This particular bit of code also carries a common bug: it tries to be slightly protocol-independent and ends up worse off for the effort. It takes the protocol family returned by gethostbyname() and copies the address using the returned length, but it copies that address into a field within a sockaddr_in and later tries to connect() to it using an AF_INET socket, passing the size of the sockaddr_in as the address length. If the family were something other than AF_INET, the sockaddr_in would probably not be filled in with anything meaningful, and the connect() call would probably fail regardless because the protocol family of the target address would not match that of the socket. As long as the only addresses gethostbyname() ever returns are IP addresses, this practice will actually work. If it returned any other kind of address, programs written this way would break. This creates an interesting problem: interfaces that might otherwise be made protocol-independent cannot be, because legacy programs do not use them correctly and changing what they return would break that software. Using a new interface designed for protocol independence (like getaddrinfo()), and using it correctly, solves this problem.

Figure 1: Hard-Coding the Network Protocol (4.4BSD Telnet)
        temp = inet_addr(hostp);
        if (temp != (unsigned long) -1) {
            sin.sin_addr.s_addr = temp;
            sin.sin_family = AF_INET;
            (void) strcpy(_hostname, hostp);
            hostname = _hostname;
        } else {
            host = gethostbyname(hostp);
            if (host) {
                sin.sin_family = host->h_addrtype;
#if     defined(h_addr)         /* In 4.3, this is a #define */
                memmove((caddr_t)&sin.sin_addr,
                                host->h_addr_list[0], host->h_length);
...
        net = socket(AF_INET, SOCK_STREAM, 0);
        setuid(getuid());
        if (net < 0) {
            perror("telnet: socket");
            return 0;
        }
#if     defined(IP_OPTIONS) && defined(IPPROTO_IP)
        if (srp && setsockopt(net, IPPROTO_IP, IP_OPTIONS, (char *)srp, srlen) < 0)
                perror("setsockopt (IP_OPTIONS)");
#endif
...
        if (connect(net, (struct sockaddr *)&sin, sizeof (sin)) < 0) {
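
For contrast with Figure 1, a client that lets name resolution choose the protocol family could be sketched as follows. The helper name connect_to() is hypothetical and the code is not taken from any telnet implementation. It assumes a system that provides getaddrinfo(), and it treats each returned sockaddr as an opaque buffer that is passed straight to connect() with the length reported by the resolver.

```c
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: connect a stream socket to host/service, trying
 * each address getaddrinfo() returns in order.
 * Returns a connected descriptor, or -1 if no address could be reached.
 */
int connect_to(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    assert(host != NULL && service != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;        /* any family the resolver offers */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        /* Family, type and protocol all come from the resolver. */
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        /* The address and its length are passed through untouched. */
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                      /* connected */
        close(fd);
        fd = -1;
    }

    freeaddrinfo(res);
    return fd;
}
```

A caller might write connect_to(hostp, "telnet") and check for -1. Nothing in the function names AF_INET or sockaddr_in, so the socket family always matches the family of the address being connected to.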

A variation of this problem is hard-coding addressing information, such as addresses and ports. Figure 2 shows an example from Sendmail 8.7.6[4]. There are three major problems here. First, the code always treats the address as a sockaddr_in without any guards. As in the example above, this is bad for protocol independence. Second, the code hard-codes an address and a port. While this is sometimes useful, it is usually bad practice, and it is always bad practice when not combined with a check of the protocol family. Third, the code explicitly specifies TCP as the transport protocol. This hard-codes a transport protocol and implies that only a small number of network protocols are usable (those that TCP has been made to run over). The second and third problems can be solved by using protocol-independent name resolution functions correctly.

Figure 2: Hard Coding Addresses and Ports (Sendmail 8.7.6)
        if (DaemonAddr.sin.sin_family == 0)
                DaemonAddr.sin.sin_family = AF_INET;
        if (DaemonAddr.sin.sin_addr.s_addr == 0)
                DaemonAddr.sin.sin_addr.s_addr = INADDR_ANY;
        if (DaemonAddr.sin.sin_port == 0)
        {
                register struct servent *sp;

                sp = getservbyname("smtp", "tcp");
                if (sp == NULL)
                {
                        syserr("554 service \"smtp\" unknown");
                        DaemonAddr.sin.sin_port = htons(25);
                }
                else
                        DaemonAddr.sin.sin_port = sp->s_port;
        }
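
As a rough illustration of how protocol-independent name resolution can replace the hard-coded address, port and transport in Figure 2, the sketch below asks getaddrinfo() for a wildcard address suitable for bind(). The helper name listen_on() and the backlog value are assumptions made for this example and are not Sendmail code. For brevity it stops at the first address that can be bound; a real daemon would more likely open one socket per returned address.

```c
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a listening stream socket for a service
 * given by name or number (for example "smtp").
 * Returns a listening descriptor, or -1 on failure.
 */
int listen_on(const char *service)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    assert(service != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;        /* no family is hard-coded */
    hints.ai_socktype = SOCK_STREAM;    /* ask for a stream, not for TCP by name */
    hints.ai_flags = AI_PASSIVE;        /* wildcard address for bind() */

    if (getaddrinfo(NULL, service, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (bind(fd, ai->ai_addr, ai->ai_addrlen) == 0 &&
            listen(fd, 5) == 0)         /* backlog of 5 is an arbitrary choice */
            break;                      /* listening */
        close(fd);
        fd = -1;
    }

    freeaddrinfo(res);
    return fd;
}
```

The wildcard address and the port number both come from the resolver, so neither INADDR_ANY nor port 25 appears in the code.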


Craig Metz 2000-05-08