
Replica Selection Algorithms

The performance improvement achieved by a replicated Web service, such as Web++, critically depends on the design of the algorithm that selects one of the replicas of the requested resource. The topic has recently been the subject of intensive study in the context of Internet services [11,22,23,38,17,35,27,19]. Each replica selection algorithm can be described by the goals it should achieve, the metrics used for replica selection and, finally, the mechanisms used for measuring those metrics. A replica selection algorithm may aim at maximizing network throughput [22,19], reducing the load on ``expensive'' links, or reducing the response time perceived by the user [11,38,17,35,27]. Most replica selection algorithms aim at selecting ``nearby'' replicas to reduce either the response time or the load on network links. The choice of metric, which defines what the ``nearby'' replicas are, is crucial because it determines both the effectiveness of achieving the goals of replica selection and the overhead resulting from measuring the metric. The metrics include response time [17,27], latency [35], ping round-trip time [11], network bandwidth [38], number of hops [11,22], and geographic proximity [23]. Since most of the above metrics are dynamic, replica selection algorithms typically estimate the current value of the metric from samples collected in the past. The selected metric can be measured either actively, by polling the servers holding the replicas [19], passively, by collecting information about previously sent requests [38], or by a combination of both [17,35].
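To make the passive-measurement approach concrete, the following sketch shows one plausible (not Web++'s actual) selector: it keeps an exponentially weighted moving average of observed response times per replica and picks the replica with the lowest estimate. The class name, hostnames, and the smoothing parameter `alpha` are illustrative assumptions, not taken from the paper.

```python
import random


class ReplicaSelector:
    """Illustrative replica selector based on passive measurement:
    response-time estimates are updated from previously sent requests,
    and the replica with the lowest current estimate is chosen."""

    def __init__(self, replicas, alpha=0.3, default_estimate=1.0):
        # alpha: EWMA smoothing weight (hypothetical choice).
        # default_estimate: optimistic initial guess (seconds) so that
        # replicas with no samples yet still get tried.
        self.alpha = alpha
        self.estimates = {r: default_estimate for r in replicas}

    def record_sample(self, replica, response_time):
        # Passive measurement: fold an observed response time into the
        # running estimate for that replica.
        old = self.estimates[replica]
        self.estimates[replica] = (1 - self.alpha) * old + self.alpha * response_time

    def select(self, explore_prob=0.05):
        # Occasionally pick a random replica so stale estimates of
        # currently unused replicas get refreshed.
        if random.random() < explore_prob:
            return random.choice(list(self.estimates))
        return min(self.estimates, key=self.estimates.get)


# Usage (hypothetical hostnames):
sel = ReplicaSelector(["us.example.com", "eu.example.com"])
sel.record_sample("us.example.com", 0.08)
sel.record_sample("eu.example.com", 0.30)
best = sel.select(explore_prob=0.0)  # → "us.example.com"
```

An active variant would instead probe each replica periodically (e.g., with a lightweight HEAD request or ping) and feed the results into the same `record_sample` path; a hybrid combines both sources of samples, as in [17,35].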