Next: Defining Similarity Up: Role Classification of Hosts Previous: System Overview

# Model

In this section, we develop a model for thinking about the grouping problem. We define the problem in the abstract, providing a model with several functions and parameters that can be adjusted to meet various goals. Later in the paper, we present and evaluate instantiations of these parameters.
• Let I be the set of hosts in an enterprise network. We will use |I| to denote the number of hosts in I.
• Let similarity be a commutative (symmetric) function that maps each pair of hosts in I to a non-negative integer, so that similarity(h1, h2) = similarity(h2, h1). Roughly speaking, if similarity(h1, h2) is high, then we would like our grouping algorithm to place the hosts h1 and h2 in the same group. Defining similarity so that it is both efficient to compute and yields a good grouping is at the heart of the problem addressed in this paper.
• A partitioning P of I respects similarity if, for all distinct groups G1, G2 ∈ P, all h1, h2 ∈ G1, and all h3 ∈ G2,
    • similarity(h1,h2) ≥ similarity(h1,h3)
    • similarity(h1,h2) ≥ similarity(h2,h3)
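We extend this definition of similarity to define the average similarity between a host h1 and a group G2, avg_similarity(h1, G2), as the ratio of the sum of the similarities between h1 and each host h2 ∈ G2 to the number of hosts in G2:

$$\mathrm{avg\_similarity}(h_1, G_2) = \frac{\sum_{h_2 \in G_2} \mathrm{similarity}(h_1, h_2)}{|G_2|}$$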
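
As a minimal sketch of the average similarity between a host and a group, the snippet below computes the mean of the pairwise similarities. The `neighbors` table and the `shared_neighbors` function are hypothetical stand-ins for a concrete similarity function, which this section deliberately leaves abstract.

```python
from typing import Callable, Hashable, Iterable

Host = Hashable


def avg_similarity(
    h1: Host,
    group: Iterable[Host],
    similarity: Callable[[Host, Host], int],
) -> float:
    """Mean of similarity(h1, h2) over every host h2 in the group."""
    members = list(group)
    if not members:
        raise ValueError("group must contain at least one host")
    return sum(similarity(h1, h2) for h2 in members) / len(members)


# Hypothetical data: the set of hosts each host communicates with.
neighbors = {
    "a": {"x", "y", "z"},
    "b": {"x", "y"},
    "c": {"z"},
}


def shared_neighbors(h1: Host, h2: Host) -> int:
    """Assumed similarity for illustration: number of common neighbors."""
    return len(neighbors[h1] & neighbors[h2])


# similarity(a, b) = 2 and similarity(a, c) = 1, so the average is 1.5.
print(avg_similarity("a", ["b", "c"], shared_neighbors))  # 1.5
```

Any symmetric, non-negative integer function could be passed in place of `shared_neighbors`.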

A partitioning P of I respects avg_similarity if, for all distinct groups G1, G2 ∈ P and all h1 ∈ G1, avg_similarity(h1, G1) ≥ avg_similarity(h1, G2).

Respecting similarity or avg_similarity is not sufficient to generate a useful partitioning of I. After all, a partitioning that puts all the nodes in one group, or one that puts each node in a separate group, respects similarity. We therefore provide a parameter that can be used by network administrators to control how aggressive the algorithm is in partitioning I into groups.
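
The observation that both trivial partitionings respect similarity can be checked with a small sketch. The checker below assumes that h1 and h2 range over distinct hosts of the same group, so that single-host groups impose no constraint; the `scores` table is hypothetical data chosen only for illustration.

```python
from itertools import combinations, permutations


def respects_similarity(partition, similarity):
    """Return True if every within-group pair is at least as similar
    as either of its members is to any host in a different group.

    Assumption: h1 and h2 are distinct hosts of the same group.
    """
    groups = [list(g) for g in partition]
    for g1, g2 in permutations(groups, 2):  # ordered pairs of distinct groups
        for h1, h2 in combinations(g1, 2):
            inside = similarity(h1, h2)
            for h3 in g2:
                if inside < similarity(h1, h3) or inside < similarity(h2, h3):
                    return False
    return True


# Hypothetical symmetric similarity scores for three hosts.
scores = {
    frozenset(("a", "b")): 5,
    frozenset(("a", "c")): 1,
    frozenset(("b", "c")): 2,
}


def similarity(h1, h2):
    return scores[frozenset((h1, h2))]


print(respects_similarity([["a", "b", "c"]], similarity))      # True (one group)
print(respects_similarity([["a"], ["b"], ["c"]], similarity))  # True (singletons)
print(respects_similarity([["a", "b"], ["c"]], similarity))    # True
print(respects_similarity([["a", "c"], ["b"]], similarity))    # False
```

The last call fails because a and c are grouped together even though a is more similar to b, which sits in a different group.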
• Let Smin, the similarity threshold, be an integer greater than 0. A partitioning P respects similarity and Smin if it respects similarity and if, for every group G ∈ P and all h1, h2 ∈ G, similarity(h1, h2) ≥ Smin.
• A partitioning P of I is said to be maximal with respect to similarity and Smin if it respects similarity and Smin and there does not exist another partitioning of I that respects similarity and Smin and has fewer groups. By adjusting Smin, one trades off between a maximal grouping with fewer groups (lower Smin) and a maximal grouping in which the members of each group are more similar to each other (higher Smin).
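
To make the definition of a maximal partitioning concrete, the following sketch enumerates every partitioning of a tiny host set and keeps one with the fewest groups among those that respect similarity and Smin. This exhaustive search is exponential and is only an illustration of the definition, not the grouping algorithm of this paper. The helper names and the `scores` table are hypothetical, and h1 and h2 are assumed to be distinct hosts.

```python
from itertools import combinations


def set_partitions(items):
    """Yield every partitioning of a list as a list of groups (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in set_partitions(rest):
        # Place `first` into each existing group in turn ...
        for i in range(len(partial)):
            yield partial[:i] + [[first] + partial[i]] + partial[i + 1:]
        # ... or into a new group of its own.
        yield [[first]] + partial


def respects(partition, similarity, s_min):
    """Check that the partitioning respects similarity and s_min."""
    for g1 in partition:
        for h1, h2 in combinations(g1, 2):
            inside = similarity(h1, h2)
            if inside < s_min:
                return False
            for g2 in partition:
                if g2 is g1:
                    continue
                for h3 in g2:
                    if inside < similarity(h1, h3) or inside < similarity(h2, h3):
                        return False
    return True


def maximal_partition(hosts, similarity, s_min):
    """Brute force: a valid partitioning with the fewest groups."""
    valid = [p for p in set_partitions(list(hosts)) if respects(p, similarity, s_min)]
    return min(valid, key=len)  # all-singletons is always valid


# Hypothetical symmetric similarity scores for four hosts.
scores = {
    frozenset(("a", "b")): 5, frozenset(("c", "d")): 4,
    frozenset(("a", "c")): 1, frozenset(("a", "d")): 1,
    frozenset(("b", "c")): 2, frozenset(("b", "d")): 1,
}


def similarity(h1, h2):
    return scores[frozenset((h1, h2))]


hosts = ["a", "b", "c", "d"]
for s_min in (1, 3, 5):
    print(s_min, len(maximal_partition(hosts, similarity, s_min)))
# 1 1
# 3 2
# 5 3
```

On this toy data, raising the threshold from 1 to 5 increases the number of groups in the maximal partitioning from one to three, while every remaining multi-host group has pairwise similarity of at least the threshold.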

Godfrey Tan 2003-04-01