
Overview

AFPA is a flexible kernel-mode platform for high-performance network servers. The architecture is flexible in several ways. First, it can be applied to a variety of application protocols such as HTTP, FTP, LDAP, and DNS. Such protocols are implemented as AFPA modules. Second, it has been implemented on four platforms: Linux, Windows 2000, AIX, and OS/390. The latter three implementations have been incorporated into current IBM products, the first of which was released as the Netfinity Web Server Accelerator in 1998. On Linux and Windows 2000, the architecture was implemented solely as a kernel module. Third, it can be used as a caching server or as an efficient layer 7 router. Fourth, it can be tightly integrated and co-located with a conventional user-level network server, or deployed as a stand-alone front-end accelerator that offloads processing from a set of "back-end" servers without requiring any modification to those servers. Fifth, AFPA can be used to enforce quality of service [23]. This section focuses on AFPA's application to Web servers.

Several factors contribute to AFPA's efficiency. They are described here in terms of data movement, event notification, and communication code path for the Linux and Windows 2000 AFPA implementations. First, data copies and reads are eliminated, improving performance when sending larger responses. Data copies are avoided by passing references to pinned cache objects directly to the protocol stack. Reads are eliminated by avoiding checksum computation when sending cache objects as responses, since computing a checksum would otherwise require reading every byte of the response. On Windows 2000, checksum computation is off-loaded to the network interface hardware. On Linux, cache objects include pre-computed checksums.
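The idea of pre-computed checksums can be illustrated with the 16-bit one's-complement sum used by the Internet checksum. The following is a minimal user-level sketch, not the AFPA kernel code: the names `CacheObject` and `checksum_for_send` are hypothetical, the sketch ignores the TCP pseudo-header and per-segment boundaries, and it assumes the header has an even length so that the two partial sums can be combined directly.

```python
def ones_complement_sum(data: bytes) -> int:
    """Folded 16-bit one's-complement sum of data (not yet inverted)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    return fold(total)


def fold(total: int) -> int:
    """Add carries back in until the value fits in 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


class CacheObject:
    """Hypothetical cache entry that stores its partial sum alongside the body."""

    def __init__(self, body: bytes):
        self.body = body
        # Computed once, when the object enters the cache.
        self.body_sum = ones_complement_sum(body)


def checksum_for_send(header: bytes, obj: CacheObject) -> int:
    """Checksum over header + body that reads only the header bytes."""
    if len(header) % 2:
        raise ValueError("sketch assumes an even-length header")
    combined = fold(ones_complement_sum(header) + obj.body_sum)
    return ~combined & 0xFFFF
```

Because the one's-complement sum is associative, the per-send cost in this sketch depends only on the header length; the body is read once, at cache-insertion time.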

Second, AFPA significantly reduces or eliminates the scheduling and context switching overhead of responding to TCP/IP events. AFPA parses requests on the same software interrupt on which TCP/IP processing occurs. AFPA then either sends the corresponding responses from the same interrupt context or queues them for sending in a thread context. In implementations where responses are sent from software interrupt context, no scheduling or context switching overhead is incurred. As shown in Section 5, responding from software interrupt context provides better performance, but responses must reside in pinned memory. The AFPA module can also use a thread-based configuration in which responses are sent from a thread context. This approach mitigates the livelock problems inherent to the software interrupt approach [24]. A hybrid approach has also been implemented. In this configuration, requests for content that is not currently pinned are still processed on software interrupt, but the unpinned responses are sent from a kernel-mode thread context, where page faults are tolerated.
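The hybrid approach can be sketched as a dispatch decision made at request time. This is a simplified user-level simulation under stated assumptions, not the AFPA implementation: `HybridDispatcher`, `CachedResponse`, and the `sent` log are hypothetical names, the two contexts are simulated by ordinary method calls, and the handling of cache misses is outside the scope of the sketch.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class CachedResponse:
    body: bytes
    pinned: bool  # True if the response resides in pinned memory


class HybridDispatcher:
    def __init__(self, cache):
        self.cache = cache        # maps URI -> CachedResponse
        self.deferred = deque()   # work handed off to thread context
        self.sent = []            # (context, uri) pairs, for illustration only

    def on_request(self, uri):
        """Runs in (simulated) software interrupt context."""
        entry = self.cache.get(uri)
        if entry is not None and entry.pinned:
            # Pinned content cannot page-fault, so respond immediately.
            self.sent.append(("softirq", uri))
        else:
            # Unpinned content may fault; defer to a context that tolerates it.
            self.deferred.append(uri)

    def run_thread_context(self):
        """Runs in (simulated) kernel-mode thread context."""
        while self.deferred:
            uri = self.deferred.popleft()
            if uri in self.cache:
                self.sent.append(("thread", uri))
```

A short usage example: with `/a` pinned and `/b` unpinned, `on_request("/a")` records a send from the simulated interrupt context at once, while `/b` is sent only after `run_thread_context()` drains the queue.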

Third, the overall communication overhead incurred in AFPA implementations is lower than that of a user-mode server relying on a socket API. AFPA interfaces directly with TCP/IP by overloading TCP/IP events with HTTP-specific processing. These events include connection establishment (SYN), data arrival, and disconnection (FIN). In overloading these events, AFPA drives a state machine associated with a protocol module such as HTTP. Direct integration with TCP avoids the queueing and descriptor management incurred when using a socket API.
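To illustrate how TCP/IP events can drive a protocol state machine, the following sketch maps the three events named above onto a toy HTTP connection object. The states, the class `HttpConnection`, and the request-parsing rule are assumptions made for illustration; they are not taken from the AFPA state machine.

```python
from enum import Enum, auto


class Event(Enum):
    SYN = auto()   # connection establishment
    DATA = auto()  # data arrival
    FIN = auto()   # disconnection


class State(Enum):
    IDLE = auto()
    READING_REQUEST = auto()
    CLOSED = auto()


class HttpConnection:
    """Hypothetical per-connection state driven directly by TCP events."""

    def __init__(self):
        self.state = State.IDLE
        self.buffer = b""
        self.requests = []  # request lines of fully received requests

    def on_event(self, event, payload=b""):
        if event is Event.SYN and self.state is State.IDLE:
            self.state = State.READING_REQUEST
        elif event is Event.DATA and self.state is State.READING_REQUEST:
            self.buffer += payload
            # A blank line terminates the request headers.
            while b"\r\n\r\n" in self.buffer:
                head, self.buffer = self.buffer.split(b"\r\n\r\n", 1)
                line = head.split(b"\r\n", 1)[0]
                self.requests.append(line.decode("ascii", "replace"))
        elif event is Event.FIN:
            self.state = State.CLOSED
        return self.state
```

In this sketch each event handler runs to completion in the caller's context, so no socket queue or descriptor sits between the arrival of data and the protocol logic.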


Philippe Joubert 2001-05-01