Next: Conclusions Up: Scalable Content-aware Request Distribution Previous: Real Workload

   
Related Work

A substantial body of work addresses the design of high-performance, scalable Web server clusters. The work includes cooperative caching proxies inside the network, push-based document distribution and other innovative techniques [7,10,13,22,23,28]. Our proposal addresses the complementary issue of providing support for scalable network servers that perform content-based request distribution.

Web servers based on clusters of workstations/PCs are widely used [18]. Most commercial Web switch products for cluster servers use a request distribution strategy that does not require examining the content of the request [2,20,14,9]. The most common such technique is some variant of weighted round-robin (WRR). Resonate, Inc. [27] is an exception: its product offers content-aware request distribution using a method similar to TCP handoff.
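To make the contrast with content-aware distribution concrete, the following is a minimal sketch of a weighted round-robin picker; the (name, weight) input format is an assumption for illustration, not taken from any of the cited products.

```python
from itertools import cycle

def weighted_round_robin(servers):
    """Yield back-end names in proportion to their weights.

    servers: list of (name, weight) pairs (hypothetical format).
    Note that the choice never depends on the request content.
    """
    # Expand each server into 'weight' consecutive slots, then cycle.
    expanded = [name for name, weight in servers for _ in range(weight)]
    return cycle(expanded)

# Server A gets three requests for every one that B receives.
picker = weighted_round_robin([("A", 3), ("B", 1)])
picks = [next(picker) for _ in range(8)]
# picks == ["A", "A", "A", "B", "A", "A", "A", "B"]
```

Real switches smooth the interleaving and adjust weights from observed load, but the essential point stands: the decision is made without looking at the request.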

Fox et al. [18] describe a layered architecture for building cluster-based network services. The architecture has a centralized load manager and several front-end and back-end nodes. This architecture is similar to the one shown in Figure 6; however, the request distribution strategy and the mechanisms employed are purely load-based and do not consider the content of the requests. Our work focuses on scalable cluster configurations for content-aware request distribution.

In [26], Pai et al. explore the use of content-based request distribution in a cluster Web server environment. This work presents an instance of a content-aware request distribution strategy, called LARD. The strategy achieves both locality, in order to increase hit rates in the Web servers' memory caches, and load balancing. Performance results with the LARD algorithm show substantial performance gains over WRR.
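The core of the basic LARD policy can be sketched in a few lines: each target (URL) is pinned to one back-end so that it stays in that node's memory cache, and a target is reassigned to the least-loaded node only when its current node is overloaded relative to the lightest one. This is a much-simplified illustration; the class name and the threshold values are invented here, not taken from [26].

```python
T_LOW, T_HIGH = 2, 4  # illustrative load thresholds, not the paper's values

class LardDispatcher:
    """Simplified sketch of locality-aware request distribution (LARD)."""

    def __init__(self, nodes):
        self.load = {n: 0 for n in nodes}  # outstanding requests per node
        self.assigned = {}                 # target URL -> back-end node

    def dispatch(self, target):
        node = self.assigned.get(target)
        least = min(self.load, key=self.load.get)
        if node is None:
            node = least  # first request for this target: least-loaded node
        elif self.load[node] > T_HIGH and self.load[least] < T_LOW:
            node = least  # relieve an overloaded node at a locality cost
        self.assigned[target] = node
        self.load[node] += 1
        return node

    def complete(self, node):
        self.load[node] -= 1  # called when a request finishes

d = LardDispatcher(["s1", "s2"])
first = d.dispatch("/a")
again = d.dispatch("/a")  # same node: the target stays pinned for locality
```

Repeated requests for the same target go to the same back-end, which is what raises the in-memory cache hit rate relative to WRR.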

More recently, in [31], Zhang et al. explore another content-based request distribution algorithm that looks at static and dynamic content and also focuses on cache affinity. They confirmed the results of [26] by showing that focusing on locality can lead to significant improvements in cluster throughput.

The key to content-based request distribution is that a client's request is inspected before a decision is made about which server node should handle it. The difficulty is that, in order to inspect the request, the client must first establish a connection with a node that may ultimately not handle the request. Two viable techniques are currently known for handling this situation: TCP splicing [15,12,29] and TCP handoff [6,26,19]. Our proposed approach offers a third alternative that scales well with the number of back-end nodes.
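The inspection step common to splicing and handoff can be illustrated in isolation: the front-end must read at least the HTTP request line on the established connection before it can pick a back-end. The routing table below is a hypothetical example; the subsequent splicing or handoff of the TCP connection is deliberately not shown.

```python
# Hypothetical routing table: URL prefix -> back-end node name.
ROUTES = [("/cgi-bin/", "dynamic-node"), ("/", "static-node")]

def pick_backend(request_bytes):
    """Parse the HTTP request line and choose a back-end by URL prefix.

    This is only the content-inspection step; handing the connection
    to the chosen node (splice or handoff) happens afterwards.
    """
    request_line = request_bytes.split(b"\r\n", 1)[0].decode("ascii")
    method, url, _version = request_line.split(" ")
    for prefix, node in ROUTES:
        if url.startswith(prefix):
            return node
    raise ValueError("no route for " + url)

# pick_backend(b"GET /cgi-bin/search HTTP/1.0\r\n\r\n") -> "dynamic-node"
```

The two known techniques differ in what happens next: with splicing the front-end stays on the data path and relays packets between the two connections, whereas with handoff the connection state is transferred so the back-end replies to the client directly.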

As mentioned earlier, the switch component of our cluster could easily be replaced by a commercial layer-4 switch. A number of layer-4+ network switch products [1,16,17,11] are currently available on the market. These commercial products use specialized hardware and advertise high performance. A subset of these switches are also advertised as layer-7 switches, meaning that they can perform URL-aware routing. We are not aware of any published performance results for these switches when used for URL-aware distribution. However, since layer-7 switching involves software processing, we expect that these products, when used for this purpose, have scalability limitations similar to those of software-based content-aware front-ends.


Peter Druschel
2000-04-25