Next: Controlling resource usage of Up: Performance Previous: Costs of new primitives

Prioritized handling of clients

Our next experiment tested the effectiveness of resource containers in enabling prioritized handling of clients by a Web server. We consider a scenario where a server's administrator wants to differentiate between two classes of clients (for example, based on payment tariffs).

Our experiment used an increasing number of low-priority clients to saturate a server, while a single high-priority client made requests of the server. All requests were for the same (static) 1KB file, with one request per connection. We measured the response time perceived by the high-priority client.


  
Figure 11: How T_high varies with load.
[Figure: T_high plotted against the number of concurrent low-priority clients, with curves for the unmodified kernel and the two resource-container configurations.]

Figure 11 shows the results. The y-axis shows the response time seen by the high-priority client (T_high) as a function of the number of concurrent low-priority clients. The dotted curve shows how T_high varies when using the unmodified kernel. The application attempted to give preference to requests from the high-priority client by handling events on its socket, returned by select(), before events on other sockets. The figure shows that, despite this preferential treatment, T_high increases sharply once there are enough low-priority clients to saturate the server. This happens because most of the request processing occurs inside the kernel, where the application has no control over it.

The dashed and solid curves in Figure 11 show the effect of using resource containers. Here, the server uses two containers with different numeric priorities, assigning the high-priority requests to one container and the low-priority requests to the other. The dashed curve, labeled ``With containers/select()'', shows the effect of resource containers with the application still using select() to wait for events. T_high increases much less than in the original system. Resource containers allow the application to control resource consumption at almost all levels of the system. For example, TCP/IP processing, which is performed in FIFO order in classical systems, is now performed in priority order.

The remaining increase in response time is due to some known scalability problems of the select() system call [5,6]. These problems can be alleviated by a smart implementation described in [6], but some inefficiency is inherent to the semantics of the select() API. The problem is that each call to select() must specify, via a bitmap, the complete set of descriptors that the application is interested in. The kernel must check the status of each descriptor in this set. This causes overhead linear in the number of descriptors handled by the application.

The solid curve, labeled ``With containers/new event API'', shows the variation in T_high when the server uses a new scalable event API, described in [5]. In this case, T_high increases only slightly as the number of low-priority clients increases. The remaining slight increase in T_high reflects the cost of packet-arrival interrupts from low-priority connections. The kernel must handle these interrupts and invoke a packet filter to determine the priority of the packet.


Gaurav Banga
1998-12-17