
   
Layer-4 Switch

We implemented a fast software-based layer-4 switch to be used as the front-end node. Although highly scalable hardware-based layer-4 switches are commercially available, we decided to implement a software-based switch for two reasons: we did not have a commercial layer-4 switch available to us, and we wanted to explore what level of scalability could be achieved with a software-based switch.

The switch maintains a small amount of state for each client connection being handled by the cluster. This state is used to determine the cluster node where packets from the client are to be forwarded. Additionally, the switch maintains the load on each cluster node based on the number of client connections handled by that node.

Upon seeing a connection establishment packet (TCP SYN packet), the switch chooses a distributor on the least loaded node in the cluster. Subsequent packets from the client are sent to the same node unless an update message instructs the switch to forward packets to some other node (update messages are sent by the distributor after a TCP connection handoff). Upon connection termination, a close message is sent to the switch by the cluster node that handles the connection and is used for discarding connection state at the switch.
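
The bookkeeping described above can be illustrated with a minimal sketch. It is not the prototype's code: the class name `Layer4Switch`, the method names, and the use of a (client address, client port) pair as the connection key are all hypothetical. The sketch also assumes that a handoff moves one unit of load from the old node to the new one, which the text does not state explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class Layer4Switch:
    """Toy model of the switch's per-connection state and per-node load.

    All names here are illustrative assumptions, not the prototype's API.
    """
    nodes: list                                      # cluster node identifiers
    conn_table: dict = field(default_factory=dict)   # connection key -> node
    load: dict = field(default_factory=dict)         # node -> open connections

    def __post_init__(self):
        for node in self.nodes:
            self.load.setdefault(node, 0)

    def on_packet(self, conn, is_syn=False):
        """Return the node a client packet is forwarded to, or None if unknown."""
        if is_syn and conn not in self.conn_table:
            # New connection: choose the node with the fewest connections
            # (ties are broken by list order in this sketch).
            node = min(self.nodes, key=lambda n: self.load[n])
            self.conn_table[conn] = node
            self.load[node] += 1
        return self.conn_table.get(conn)

    def on_update(self, conn, new_node):
        """Update message: sent by a distributor after a TCP connection handoff."""
        old_node = self.conn_table.get(conn)
        if old_node is None:
            return
        # Assumption: the connection now counts toward the new node's load.
        self.load[old_node] -= 1
        self.load[new_node] += 1
        self.conn_table[conn] = new_node

    def on_close(self, conn):
        """Close message: discard the connection state held at the switch."""
        node = self.conn_table.pop(conn, None)
        if node is not None:
            self.load[node] -= 1
```

In this sketch a SYN creates a table entry, later packets are a single dictionary lookup, and the update and close messages are the only ways an entry changes or disappears, which keeps the per-connection state small.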

All outgoing data from the cluster is sent directly to the clients and does not pass through the switch; the switch receives and forwards only the packets sent by the clients. To improve switch performance, the forwarding module in the switch avoids interrupts and uses soft-timer-based polling to receive network packets [5,25]. Section 6 reports the forwarding throughput of this switch.

Using a front-end layer-4 switch (as opposed to using, for instance, DNS round-robin to distribute requests among the server nodes) offers several important advantages. The first is increased security. Because the back-end nodes of the cluster are hidden from the outside world, would-be attackers cannot attack them directly. The second advantage is availability. Because the individual cluster nodes are transparent to the client, failed or off-line server nodes can be replaced without the client noticing. Finally, when combined with TCP handoff, the use of a switch increases efficiency, as ACK packets from clients need not be forwarded by the server node that originally received a request.

Of course, a front-end switch forms a single point of failure in the cluster. There are, however, a number of possible solutions to this problem, such as using a hot-swappable spare.


Peter Druschel
2000-04-25