When The Lights Go Out

When insider information is unavailable because bot activity is not echoed on the channel (and so can no longer be overheard by an IRC tracker), a botnet's size can still be estimated from externally visible information. Techniques of this kind, however, can only provide an estimate of a botnet's footprint.

A natural source of externally visible information about a botnet's prevalence is DNS. In our earlier work [14], we explored the use of DNS cache snooping to uncover a botnet's footprint. In short, our approach takes advantage of the fact that bots normally make a DNS query to resolve the IP address of their IRC server before joining the command and control channel. Our technique estimates a botnet's DNS footprint by probing the caches of a large collection of DNS servers and recording all cache hits. A cache hit implies that at least one bot has queried its nameserver within the time to live (TTL) interval of the DNS entry corresponding to the botnet server. The total number of cache hits provides an indication of the botnet's DNS footprint.
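The probing step can be sketched as follows. The text above does not spell out the probe mechanics, so this sketch assumes one common way to snoop a cache: send a query with the recursion-desired (RD) bit cleared, so that the server answers only from data it already holds. The function names, the fixed transaction ID, and the timeout are hypothetical choices made for illustration, not details of the method in [14].

```python
import socket
import struct


def build_query(name, txid=0x1234):
    """Build a DNS A-record query with the RD (recursion desired) bit cleared."""
    # Header: ID, flags (0x0000 = standard query, RD=0), QDCOUNT=1, others 0.
    header = struct.pack("!HHHHHH", txid, 0x0000, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def is_cache_hit(response, txid=0x1234):
    """True if the reply matches our query, has RCODE 0, and carries an answer."""
    if len(response) < 12:
        return False
    rid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    is_response = bool(flags & 0x8000)
    rcode = flags & 0x000F
    return rid == txid and is_response and rcode == 0 and ancount > 0


def probe(resolver_ip, name, timeout=2.0):
    """Send one non-recursive query; timeouts and socket errors count as no hit."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (resolver_ip, 53))
            data, _ = sock.recvfrom(512)
        except OSError:
            return False
    return is_cache_hit(data)


def dns_footprint(resolvers, name):
    """Count the resolvers whose cache currently holds the name."""
    return sum(probe(ip, name) for ip in resolvers)
```

A call such as `dns_footprint(["192.0.2.1", "192.0.2.2"], "irc.example.com")` (placeholder addresses and name) would return the number of probed servers reporting a hit. Servers that ignore the RD bit or that are authoritative for the name would be miscounted by this simple check.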

That said, a botnet's DNS footprint provides (at best) only a lower bound on its actual footprint. For one, using DNS to estimate size is only possible with DNS servers that allow probes from arbitrary clients and reply to queries for cached results. To yield an accurate estimate, this technique requires a large list of such servers. Moreover, botnet servers whose DNS names have low TTLs further complicate this technique because, for such names, the probability of a cache hit from an infected domain is low. Finally, a hit indicates only the existence of at least one infected host associated with that DNS server.
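The effect of a low TTL can be illustrated with a simple model. The model is an assumption of this sketch, not part of the original analysis: bots behind one resolver look up the name as a Poisson process, and each lookup that misses the cache stores the record for exactly one TTL. The cache then alternates between a cached period of length TTL and an empty period whose mean is the average gap between lookups, so a probe at a random time hits with probability rate × TTL / (1 + rate × TTL).

```python
def cache_hit_probability(lookups_per_hour, ttl_seconds):
    """Long-run fraction of time the record sits in the resolver's cache.

    Assumes Poisson lookups and a TTL that is not refreshed by cache hits
    (both are modelling assumptions of this sketch).
    """
    x = lookups_per_hour * ttl_seconds / 3600.0  # expected lookups per TTL
    return x / (1.0 + x)


# One lookup per hour on average behind a given resolver:
long_ttl = cache_hit_probability(1, 3600)   # TTL of one hour, roughly 0.5
short_ttl = cache_hit_probability(1, 60)    # TTL of one minute, roughly 0.016
```

Under these assumptions, shortening the TTL from an hour to a minute makes a probe about thirty times less likely to find the record, even though the same bots are present.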

Recently, Ramachandran et al. [15] suggested another DNS-based technique. Their approach infers bot counts by observing DNS lookups for hosts listed in DNS-based blackhole lists. The rationale behind this approach is that botmasters tend to query these lists to check whether their bots are blacklisted and thereby unusable for certain tasks (e.g., sending spam e-mails). This approach has the potential to provide an overall estimate of possible bots in DNS-based blackhole lists, but it clearly cannot estimate the footprint or the live population of a specific botnet.

Fabian Monrose 2007-04-03