It is widely accepted that botnets pose one of the most significant threats to the Internet. For the most part, this belief has been supported by the conjecture that at any moment in time, there is a large collection of well-connected compromised machines that can be coordinated to partake in malicious activities at the whim of their botmaster(s). Indeed, the potential threat of botnets comprising several hundred thousand bots has recently captured the headlines of the press [11,18], but the question of size itself continues to be a point of debate within the research community.

In particular, the question of how we arrive at size estimates, or, more importantly, just what they mean, remains unanswered. As a case in point, while earlier studies (e.g., [4,5,14]) have proposed a number of techniques to measure the size of botnets, they provide highly inconsistent estimates. For example, while Dagon et al. [5] established that botnet sizes can reach 350,000 members, the study of Rajab et al. [14] seems to indicate that the effective sizes of botnets rarely exceed a few thousand bots. Clearly, something is amiss.

In this paper, we attempt to shed light on the question of botnet membership. Our study primarily focuses on IRC botnets because of their continuing prominence on the Internet today. Specifically, we survey a number of techniques for determining botnet membership and examine the different views they generate. As we show later, the inconsistency among the resulting outcomes is largely tied to the counting techniques being used, and does not necessarily reflect a change in underlying activity during the time that these studies were undertaken. For example, one of the botnets we tracked appeared to consist of 48,500 bots over the entire tracking period. However, if we examine the bots that simultaneously appeared on the bot server in question, the size does not exceed 3,000 bots. At a high level, this suggests that expecting a single definitive answer to the question of botnet membership is unreasonable. Instead, ``botnet size'' should be a qualified term that includes the specifics of the counting method, its caveats, and the context in which the measurements are relevant.
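To make the gap between these two views concrete, the following minimal sketch computes both counts from the same set of observations. It is an illustration only, not the measurement method used in this study: the session record format, the function names `cumulative_size` and `peak_simultaneous_size`, and the sample data are all hypothetical, and the sketch assumes that each bot has a stable identifier and that sessions of the same bot do not overlap.

```python
from typing import List, Tuple

# Hypothetical observation record: (bot_id, join_time, leave_time).
# Times are arbitrary integer ticks; a session covers [join_time, leave_time).
Session = Tuple[str, int, int]


def cumulative_size(sessions: List[Session]) -> int:
    """Count distinct bot identifiers seen over the whole tracking period."""
    return len({bot_id for bot_id, _, _ in sessions})


def peak_simultaneous_size(sessions: List[Session]) -> int:
    """Return the largest number of bots present at the same moment.

    Assumes sessions belonging to the same bot never overlap.
    """
    events = []
    for _, join, leave in sessions:
        events.append((join, 1))    # arrival
        events.append((leave, -1))  # departure
    # Tuples sort by time first; at equal times, departures (-1) sort
    # before arrivals (+1), which matches the half-open session interval.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Made-up sessions for four bots; bot "a" reconnects once.
sample = [
    ("a", 0, 10),
    ("b", 5, 15),
    ("c", 12, 20),
    ("a", 18, 25),
    ("d", 30, 40),
]

print(cumulative_size(sample))         # 4
print(peak_simultaneous_size(sample))  # 2
```

Even on this toy input the two counts differ by a factor of two, because bots that come and go at different times all contribute to the cumulative count but never to the same simultaneous snapshot.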

Additionally, we show that the issue of botnet membership extends beyond single-botnet considerations, in that potential cross-botnet relationships add another challenge to estimating membership. Specifically, our preliminary insights raise questions about the extent to which we can assert that two or more botnets are different and, more importantly, about the degree of overlap among the populations of different botnets.

In summary, this paper makes the following contributions: (i) we explore a number of mechanisms (including prior work) for estimating botnet sizes and highlight the challenges associated with each, (ii) we present results of applying these techniques to data derived from a large-scale measurement study and show the extent of the discrepancy between the different size estimates, and (iii) we examine potential hidden structures among the botnets we tracked and highlight their implications for determining botnet membership.

The remainder of this paper is organized as follows. Section 2 provides a comprehensive list of botnet size estimation techniques and highlights the challenges associated with each. In Section 3 we present the results of applying these techniques to botnet data collected from a wide-scale monitoring experiment. In Section 4 we examine the existence of hidden structures among the botnets we tracked, while Section 5 discusses related work. We conclude in Section 6 with a discussion about the subtleties associated with counting botnet membership.

