Table 2: Top 10 operating systems (a) and ports (b) among the 2,963 general-purpose hosts.
(a) Operating systems
Name          Count (%)
Windows       1604 (54.1)
Solaris        301 (10.1)
Mac OS X       296 (10.0)
Linux          296 (10.0)
Mac OS         204 (6.9)
FreeBSD         66 (2.2)
IRIX            60 (2.0)
HP-UX           32 (1.1)
BSD/OS          28 (0.9)
Tru64 Unix      22 (0.7)

(b) Ports
Number               Count (%)
139 (netbios-ssn)    1640 (55.3)
135 (epmap)          1496 (50.4)
445 (microsoft-ds)   1157 (39.0)
22 (sshd)             910 (30.7)
111 (sunrpc)          750 (25.3)
1025 (various)        735 (24.8)
25 (smtp)             575 (19.4)
80 (httpd)            534 (18.0)
21 (ftpd)             528 (17.8)
515 (printer)         462 (15.6)

Together, the hosts in our study have 2,569 attributes representing operating systems and open ports. Table 2 shows the ten most prevalent operating systems and open ports identified on the general-purpose hosts. Table 2.a shows the number and percentage of hosts running the named operating systems. As expected, Windows is the most prevalent OS (54% of general-purpose hosts). Individual Unix variants range in prevalence from 0.03% to 10%, but collectively they comprise a substantial fraction of the hosts (38%).

Table 2.b shows the most prevalent open ports on the hosts and the network services typically associated with those port numbers. These ports correspond to services running on the hosts and represent their points of vulnerability. On average, each host had seven ports open. However, the number of open ports per host varied considerably: 170 hosts had only one port open, while one host (running firewall software) had 180 ports open. Windows services dominate the network services running on hosts, with netbios-ssn (55%), epmap (50%), and microsoft-ds (39%) topping the list. The most prevalent services typically associated with Unix are ssh (31%) and sunrpc (25%). Web servers on port 80 are roughly as prevalent as ftp servers (18% each).

These results show that software diversity is significantly skewed. Most hosts have open ports that are shared by many other hosts (Table 2 lists specific examples). However, most attributes are found on few hosts, i.e., most open ports are open on only a few hosts. From our traces, we observe that each of the 20 most prevalent attributes is found on 10% or more of the hosts, while every remaining attribute is found on fewer than 10% of the hosts.
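
To make the notion of attribute prevalence concrete, the following is a minimal sketch of how such a skew could be measured from per-host attribute sets. The function names, the `os:`/`port:` labeling scheme, and the toy data are assumptions made for illustration; they are not taken from our traces or tools.

```python
from collections import Counter

def attribute_prevalence(hosts):
    """Return {attribute: fraction of hosts that have it}.

    `hosts` maps a host name to the set of its attributes
    (hypothetical labels such as "os:Windows" or "port:139").
    """
    counts = Counter()
    for attrs in hosts.values():
        counts.update(set(attrs))  # count each attribute once per host
    n = len(hosts)
    return {attr: c / n for attr, c in counts.items()}

def widely_shared(prevalence, threshold=0.10):
    """Attributes found on at least `threshold` of hosts, most prevalent first."""
    selected = [a for a, p in prevalence.items() if p >= threshold]
    return sorted(selected, key=lambda a: (-prevalence[a], a))

# Toy data, invented for illustration only.
hosts = {
    "h1": {"os:Windows", "port:139", "port:135"},
    "h2": {"os:Windows", "port:139", "port:445"},
    "h3": {"os:Linux", "port:22"},
    "h4": {"os:Solaris", "port:22", "port:111"},
}
prev = attribute_prevalence(hosts)
common = widely_shared(prev, threshold=0.5)
```

In this toy example, `common` holds the attributes present on at least half of the hosts; with real data, a threshold of 0.10 would correspond to the 10% cutoff discussed above.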

These results are encouraging for the process of finding cores. Having many attributes that are not widely shared makes it easier to find replicas that cover each other's attributes, preventing a correlated failure from affecting all replicas. We examine this issue next.
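
As a simplified illustration of this intuition, the sketch below checks whether a candidate set of replicas has any attribute in common. It is only a reading of the paragraph above under the assumption that a correlated failure exploits a single attribute; it is not the definition of a core or the search procedure we use, and the names and data are hypothetical.

```python
def shared_by_all(replicas):
    """Return the attributes present on every replica in the set."""
    sets = [set(attrs) for attrs in replicas]
    return set.intersection(*sets) if sets else set()

def survives_single_attribute_failure(replicas):
    """True if no single attribute is common to all replicas.

    Assumption: a failure that exploits one attribute (an OS or an
    open port) affects exactly the hosts that have that attribute.
    """
    return len(replicas) > 0 and not shared_by_all(replicas)

# Toy replica sets, invented for illustration only.
diverse = [
    {"os:Windows", "port:139", "port:135"},
    {"os:Linux", "port:22"},
]
similar = [
    {"os:Windows", "port:139", "port:135"},
    {"os:Windows", "port:139", "port:445"},
]
```

Here `diverse` shares no attribute, so at least one replica survives any single-attribute failure, whereas `similar` shares both the operating system and port 139.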

