Using our workloads, we find that the main Flash process blocks inside the kernel on operations other than select() or kevent(), and that the system shows idle CPU time. While CPU idle time is not surprising for a workload that accesses disk, the main process in Flash should never block: all disk activity should be channeled to the helpers.
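As a rough user-level illustration of how such blocking might be spotted, the sketch below compares wall-clock time with CPU time around a single handler call. It is not the measurement method used for the results here: the helper name `run_and_measure_blocking` is hypothetical, and the wall-minus-CPU gap only approximates time blocked in the kernel, since it also includes any time the process was preempted.

```python
import time

def run_and_measure_blocking(handler, *args):
    """Run handler(*args) and estimate the time the process spent off-CPU.

    Hypothetical helper: a large gap between wall-clock time and CPU time
    suggests the process was blocked (or preempted) during the call.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = handler(*args)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    # Off-CPU estimate; clamp at zero to absorb timer resolution differences.
    return result, max(0.0, wall_used - cpu_used)

if __name__ == "__main__":
    # time.sleep stands in for a handler that blocks in the kernel.
    _, blocked = run_and_measure_blocking(time.sleep, 0.05)
    print(f"estimated off-CPU time: {blocked:.3f} s")
```

In an event-driven server, a handler whose off-CPU estimate is consistently well above zero would be a candidate for the kind of unexpected blocking described above.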
Examining the number of ready file descriptors returned per invocation of select() or kevent() provides more evidence of blocking. These calls form the main loop of an event-driven server and are invoked repeatedly for as long as the system is active. Event handlers then act on each returned file descriptor according to its value and action indicator. The number of ready descriptors returned by a select() or kevent() call therefore reflects the length of the queue that the event handlers will process. The CDF of the number of ready descriptors is shown in Figure 1, and indicates that these calls typically return a large number of ready events per call. For select(), the median number of ready descriptors is 12, the mean is 61, and the maximum is more than 600. More than 25% of the invocations return over 100 ready descriptors. The distribution for kevent() is similar.
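The measurement described above can be sketched as a select() loop that records how many descriptors each call returns and then summarizes the distribution. This is a minimal illustration, not Flash's instrumentation: the function names, the socketpair-based demo, and the `threshold` parameter are assumptions made for the example, and the demo assumes a Unix-like system where data written to a socket pair is readable immediately.

```python
import select
import socket
import statistics

def count_ready_per_call(socks, max_calls, timeout=0.05):
    """Run a select() loop and record the number of ready descriptors per call."""
    ready_counts = []
    open_socks = list(socks)
    for _ in range(max_calls):
        if not open_socks:
            break
        readable, _, _ = select.select(open_socks, [], [], timeout)
        ready_counts.append(len(readable))
        for s in readable:
            # Stand-in for the real event handler: drain the descriptor.
            if not s.recv(4096):
                open_socks.remove(s)  # peer closed the connection
    return ready_counts

def summarize(counts, threshold=100):
    """Summary statistics of the per-call ready-descriptor counts."""
    return {
        "median": statistics.median(counts),
        "mean": statistics.fmean(counts),
        "max": max(counts),
        "frac_over": sum(c > threshold for c in counts) / len(counts),
    }

def demo():
    """Eight socket pairs, five with pending data, observed over two calls."""
    pairs = [socket.socketpair() for _ in range(8)]
    try:
        for _, writer in pairs[:5]:
            writer.send(b"x")
        return count_ready_per_call([r for r, _ in pairs], max_calls=2)
    finally:
        for reader, writer in pairs:
            reader.close()
            writer.close()

if __name__ == "__main__":
    counts = demo()
    print(counts, summarize(counts, threshold=1))
```

In the demo, the first call reports the five descriptors with pending data and the second call reports none, since the handler drained them. A server that keeps up with its load would mostly see small counts, whereas the distribution reported above has a long tail.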
In this workload, the CPU should never be idle while multiple descriptors are ready. Even if the amount of available work decreases, the main loop should simply call select() or kevent() more often, decreasing the number of ready descriptors per call. Only when a single ready descriptor is returned per call should the CPU exhibit any idle time. Since we observe idle time, large batches of ready descriptors, and blocking in the main process together, we conclude that the blocking is causing both the CPU idle time and the batching. Even though descriptors are ready for servicing and the CPU has idle time, the blocking system calls are artificially limiting performance and increasing latency.
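The argument above can be illustrated with a toy model of a single-threaded event loop. The sketch below is a simplified simulation under stated assumptions, not a model of Flash itself: one descriptor becomes ready every `interarrival` time units, each handler uses `service` units of CPU, and every `block_every`-th handler additionally blocks for `block_time` units. All parameter names and values are hypothetical.

```python
def simulate(n_events, interarrival=1.0, service=0.25,
             block_every=0, block_time=0.0):
    """Toy single-threaded event loop; returns (batch_sizes, idle_fraction).

    Event i becomes ready at time i * interarrival. Each loop iteration
    models one select() call that returns every event ready by 'now'.
    """
    now = 0.0
    busy = 0.0        # total CPU time spent in handlers
    next_event = 0    # index of the first event not yet returned
    handled = 0
    batches = []
    while next_event < n_events:
        arrival = next_event * interarrival
        if arrival > now:
            now = arrival  # nothing ready: select() sleeps until the next event
        ready = min(n_events, int(now // interarrival) + 1) - next_event
        batches.append(ready)
        next_event += ready
        for _ in range(ready):
            handled += 1
            now += service
            busy += service
            if block_every and handled % block_every == 0:
                # Handler blocks in the kernel: time passes, the CPU does no
                # work, and newly ready descriptors queue up behind it.
                now += block_time
    return batches, 1.0 - busy / now

if __name__ == "__main__":
    for label, kwargs in [("no blocking", {}),
                          ("blocking", {"block_every": 10, "block_time": 20.0})]:
        batches, idle = simulate(100, **kwargs)
        print(f"{label}: max batch = {max(batches)}, idle = {idle:.2f}")
```

Without blocking, the model shows idle time only alongside batches of a single descriptor, as the text predicts. With blocking enabled, idle time and large batches appear together, which is the combination observed in the measurements.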