Check out the new USENIX Web site. next up previous
Next: CUP Update Types Up: CUP Protocol Design Previous: CUP Protocol Design

CUP Overview

CUP is not tied to any particular search mechanism and therefore can be applied in both networks that perform structured search as well as networks that perform unstructured search. As described above, in structured search, queries follow a well-defined path from the querying node to an authority node that holds the index entries pertaining to the query [4,6,5,7]; in unstructured search, queries haphazardly travel through the network via flooding or random walks in search of index entries [2,8].

In the interest of space, in this paper we describe and evaluate how CUP works to maintain caches of index entries in structured peer-to-peer networks. The basic idea is that every node in the peer-to-peer network maintains two logical channels per neighbor: a query channel and an update channel. The query channel is used to forward search queries for objects of interest to the neighbor that is closest to the authority node holding the entries for those objects. The update channel is used to forward query responses asynchronously to a neighbor and to update index entries that are cached at the neighbor.

Queries for an item travel ``up'' the query channels of nodes along the path toward the authority node for that item. Updates travel ``down'' the update channels along the reverse path taken by a query. Figure 1 shows this process. The process of querying for items and updating cached index entries pertaining to those items forms a CUP tree, similar to an application-level multicast tree where vertices are peer nodes interested in receiving updates for cached index entries.

\includegraphics[height=6cm]{pics/LogicalChannel4.eps}
Figure 1: CUP Query & Update Channels. $ A_1$ and $ A_2$ are authority nodes for some objects. A query arriving at node $ N_2$ for an item for which $ A_1$ is the authority is pushed onto query channel $ Q_{N_1}$ to $ N_1$. If $ N_1$ has a cached unexpired entry for the item, it returns it to $ N_2$ through $ U_{N_1}$. Otherwise, it forwards the query towards $ A_1$. Any update for an item originating from authority node $ A_1$ flows downstream to $ N_1$ which may forward it onto $ N_2$ through $ U_{N_1}$. The analogous process holds for queries at $ N_1$ for items for which $ A_2$ is one of the authority nodes.

The query channel enables ``query coalescing''. If a node receives two or more queries for an item for which it does not have a fresh response, the node pushes only one instance of the query for that item up its query channel. This approach can have significant savings in traffic, because bursts of queries for an item are coalesced into a single request. Through simple bookkeeping (setting an interest bit) the node registers the interest of its neighbors so it knows which of its neighbors to push the query response to when it arrives.

The cascaded propagation of updates from authority nodes down the reverse paths of search queries has many advantages. First, updates extend the lifetime of cached entries allowing intermediate nodes to continue serving queries from their caches without re-issuing new queries. It has been shown that up to fifty percent of content hits at caches are instances where the content is valid but stale and therefore cannot be used without first being re-validated [9]. These occurrences are called freshness misses. Second, a node that proactively pushes updates to interested neighbors reduces its load of queries generated by those neighbors. Third, the further down an update gets pushed, the shorter the distance subsequent queries need to travel to reach a fresh cached answer. As a result, search query latency is reduced. Reducing search query latency is important because the user must wait until the search query has successfully returned a set of index entries before choosing from which replica node to download the content. Finally, updates can help prevent errors by invalidating outdated entries. For example, an update to delete a fresh but invalid index entry prevents a node from erroneously answering queries using the entry before it expires.


next up previous
Next: CUP Update Types Up: CUP Protocol Design Previous: CUP Protocol Design
Mema Roussopoulos 2003-04-04