

Scalable Chat




Figure 7: Implementation of the chat application. Chat rooms are modeled as files, with reads corresponding to receiving conversation updates and writes to sending out a message. On a write, the WebFS server updates all of its clients; the updates are propagated to other servers in a lazy fashion.

The next application we implement is Internet chat. The application allows individuals to enter and leave chat rooms and to converse with others in the same logical room. The chat application is implemented as Java applets run through a Web browser. Figure 7 depicts our implementation of the application. Individual chat rooms are modeled as files exported through WebFS [Vahdat et al. 1996], a file system that allows global URL read/write access. WebFS provides for negotiation of various cache consistency protocols on file open.

We extended WebFS to implement a scalable caching policy suited to the chat application. In this model, when a user wishes to enter a chat room, the client simply opens a well-known file associated with the room. This operation registers the client with WebFS. Read and write operations on the file correspond to receiving messages from other chatters and sending a message out to the room, respectively. On receiving a file update (a new message), WebFS sends the update to all clients that have opened the file for reading (i.e., all chatters in the room). The client interface applet consists of two threads: a read thread that continuously polls the chat file and an event thread that writes user input to the chat file. These read/write requests are sent to the chat server via the director applet.
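The two-thread client structure described above can be sketched roughly as follows. This is a minimal illustration, not the applet code from the paper: the `ChatFile` interface, the in-memory room, the 10 ms polling interval, and all method names are assumptions made for the example, and the director applet that would sit between the client and the server is omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class ChatClientSketch {
    /** Hypothetical stand-in for a chat room file; this is not the WebFS API. */
    interface ChatFile {
        List<String> readFrom(int offset);   // messages at or after the given offset
        void write(String message);          // append one message to the room
    }

    /** In-memory chat file, used only to make the sketch runnable. */
    static class InMemoryChatFile implements ChatFile {
        private final List<String> messages = new ArrayList<>();
        public synchronized List<String> readFrom(int offset) {
            int from = Math.min(offset, messages.size());
            return new ArrayList<>(messages.subList(from, messages.size()));
        }
        public synchronized void write(String message) { messages.add(message); }
    }

    private final ChatFile room;
    private final BlockingQueue<String> userInput = new LinkedBlockingQueue<>();
    private final BlockingQueue<String> display = new LinkedBlockingQueue<>();
    private Thread reader;
    private Thread writer;

    ChatClientSketch(ChatFile room) { this.room = room; }

    void start() {
        // Read thread: polls the chat file for messages it has not seen yet.
        reader = new Thread(() -> {
            int offset = 0;
            try {
                while (true) {
                    List<String> fresh = room.readFrom(offset);
                    offset += fresh.size();
                    display.addAll(fresh);
                    Thread.sleep(10);        // polling interval is an arbitrary choice
                }
            } catch (InterruptedException e) { /* stop requested */ }
        });
        // Event thread: writes each line of user input to the chat file.
        writer = new Thread(() -> {
            try {
                while (true) room.write(userInput.take());
            } catch (InterruptedException e) { /* stop requested */ }
        });
        reader.setDaemon(true);
        writer.setDaemon(true);
        reader.start();
        writer.start();
    }

    void send(String text) { userInput.add(text); }

    /** Waits up to timeoutMillis for the next received message; null on timeout. */
    String nextMessage(long timeoutMillis) {
        try {
            return display.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    void stop() { reader.interrupt(); writer.interrupt(); }
}
```

Two clients constructed over the same `InMemoryChatFile` each see the messages the other sends, which mirrors the "open the room's file, then read and write" model at a very small scale.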

The director sends the request to the hostname that represents the best service node at the time. If the request does not complete, the request raises an exception to the director applet. The director applet then calls the service-specific cleanup routine for the request and resends it to another service node. Note that the request translates a service-specific failure event, such as a missing chat file or a WebFS server that is down, into a general exception. Thus, the director applet can be written for a cluster of machines and reused for many different protocols: FTP, Telnet, and chat.
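The retry behavior described above might look roughly like the following sketch. The `Request` interface, the `ServiceException` type, and the idea of passing the nodes as a best-first list are assumptions made for illustration; the paper does not give the director applet's actual interfaces.

```java
import java.util.List;

/** Hypothetical sketch of the director's retry loop; names are illustrative only. */
class DirectorSketch {
    /** General exception into which service-specific failures are translated. */
    static class ServiceException extends RuntimeException {
        ServiceException(String message, Throwable cause) { super(message, cause); }
    }

    /** A protocol-specific request (chat, FTP, ...) that the director can send and clean up. */
    interface Request<T> {
        T sendTo(String host);            // throws ServiceException if the request fails
        void cleanup(String failedHost);  // service-specific cleanup before a resend
    }

    /** Tries each node in preference order until one completes the request. */
    static <T> T dispatch(Request<T> request, List<String> nodesBestFirst) {
        ServiceException last = null;
        for (String host : nodesBestFirst) {
            try {
                return request.sendTo(host);
            } catch (ServiceException e) {
                request.cleanup(host);    // let the service undo partial work
                last = e;                 // remember the failure, then try the next node
            }
        }
        throw new ServiceException("all service nodes failed", last);
    }
}
```

Because `dispatch` only sees the general exception and the `cleanup` hook, the same loop can serve any protocol whose requests implement the interface.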

As the discussion above suggests, a single WebFS server can quickly become a performance bottleneck as the number of simultaneous users grows. To provide system scalability, we allow multiple WebFS servers to handle client requests for a single file. Each server keeps a local copy of the chat file. Upon receiving a client update, WebFS distributes the update to each of the chat clients connected to it. WebFS also accumulates updates and, every 300 ms, propagates them to the other servers in the WebFS group. This caching model allows out-of-order message delivery, but we deemed such semantics acceptable for a chat application. If such semantics prove insufficient, well-known distribution techniques [Birman 1993][Ladin et al. 1992] can be used to provide strong ordering of updates.
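The eager-local, lazy-remote update policy can be sketched as below. This is an illustrative model rather than WebFS code: the class and method names are hypothetical, servers are plain in-process objects, and clients are modeled as callbacks. Only the 300 ms period comes from the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical model of one server in the group; not the WebFS implementation. */
class ChatServerSketch {
    private final List<String> chatFile = new ArrayList<>();          // local copy of the chat file
    private final List<String> pending = new ArrayList<>();           // updates not yet sent to peers
    private final List<Consumer<String>> clients = new ArrayList<>(); // clients with the file open
    private final List<ChatServerSketch> peers = new ArrayList<>();

    synchronized void openForRead(Consumer<String> client) { clients.add(client); }
    synchronized void addPeer(ChatServerSketch peer) { peers.add(peer); }

    /** A client write: deliver to local clients now, queue for lazy propagation. */
    synchronized void write(String message) {
        deliverLocally(message);
        pending.add(message);
    }

    /** Updates arriving from a peer are delivered locally but not forwarded again. */
    synchronized void receiveFromPeer(List<String> batch) {
        for (String message : batch) deliverLocally(message);
    }

    // Callers must hold this object's lock.
    private void deliverLocally(String message) {
        chatFile.add(message);
        for (Consumer<String> client : clients) client.accept(message);
    }

    /** Sends all accumulated updates to every peer in one batch. */
    void flush() {
        List<String> batch;
        List<ChatServerSketch> targets;
        synchronized (this) {
            if (pending.isEmpty()) return;
            batch = new ArrayList<>(pending);
            pending.clear();
            targets = new ArrayList<>(peers);
        }
        // Call peers outside our own lock so two servers flushing to each other cannot deadlock.
        for (ChatServerSketch peer : targets) peer.receiveFromPeer(batch);
    }

    /** Runs flush() every 300 ms, the propagation period given in the text. */
    ScheduledExecutorService startPropagation() {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::flush, 300, 300, TimeUnit.MILLISECONDS);
        return timer;
    }
}
```

If clients on two different servers each write before either server flushes, each side sees its own local message first, so the two groups of clients observe the messages in different orders. That is the out-of-order delivery the text accepts for chat.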



Figure 8: Chat response times in the face of server load. The chat application delivers latencies of approximately 10 ms under normal circumstances. On server failure, the application takes one second to switch to a peer server.

Since read requests are idempotent and write requests are atomic with respect to WebFS, the chat application is completely tolerant of server crashes. This fault transparency gives the programmer of the chat client interface applet the illusion of a single, highly available chat server machine. Figure 8 demonstrates the behavior of the chat application when the client's primary server fails. The graph plots response time as a function of elapsed time. The graph shows that chat delivers less than 5 ms latency to the end user. When a failure is detected, the latency jumps to 1 second while the client switches to a secondary WebFS server, after which the latency returns to normal.


Amin Vahdat
Mon Nov 18 15:34:35 PST 1996