
EPCM Results

To address the need for a greater prefetch lead time, we modified our PCM kernel to implement EPCM-based prefetching. Figure 7(a) compares the results of EPCM-based prefetching with those from the previous section. From this graph we see that EPCM-based prefetching reduced our elapsed times by 1.11 seconds, or 12%. While this is a modest gain in total elapsed time for the benchmark, it is a significant reduction when one recalls that the best reduction possible is 1.21 seconds of I/O latency. Thus, with EPCM-based prefetching we reduced the time this benchmark spent waiting on I/O by 90%.
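The 90% figure follows directly from the two measured quantities above, as a quick check:

\[
\frac{1.11\ \mathrm{s\ (elapsed\ time\ saved)}}{1.21\ \mathrm{s\ (total\ I/O\ latency\ available)}} \approx 0.92,
\]

that is, roughly 90% of the available I/O wait time was eliminated.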


    
Figure: Reductions in elapsed times and read latencies for the Andrew benchmark with the last successor, PCM, EPCM and hot cache tests. Bars marked with P and E represent PCM and EPCM tests respectively. Partition sizes (ps) and model order (mo) are labeled as ps/mo.
[Figure 7: (a) Elapsed Time Reduction; (b) Read Latency Reduction]

Figure 7(b) shows the read latency reductions from EPCM-based prefetching. The latencies for read system calls with EPCM-based prefetching are as low as 127 microseconds, a reduction in read latency of 80%. This latency is less than the 139-microsecond latency of the hot cache test. EPCM-based prefetching does better than the hot cache test because Linux 2.2.12 does not write data directly from the page cache; it must first transfer the data to the buffer cache for writing. The first part of the Andrew benchmark creates object files. As these files are written, they are moved from the page cache to the buffer cache. During the linking phase, we read all of this object file data. In the hot cache case, each read system call must copy the data from buffers in the buffer cache to a new page in the page cache. This buffer copy is time consuming. For files that are prefetched, this copy is done during the prefetch engine's execution and not during the read system call.

Figure 8 shows the distribution of read events for a typical hot cache test and a typical EPCM-based prefetching test. The hot cache test has significantly more events in the 129-256 microsecond bucket, while the EPCM test appears to account for that difference in the 17-32 and 33-64 microsecond buckets. In other words, it appears many of the read system calls became about 100 to 200 microseconds shorter as a result of the prefetching. In fact, during the selected hot cache run of the Andrew benchmark, we observed 1993 copies from the buffer cache to the page cache during read system calls. Since the predictive prefetching tests would do these buffer copies during their open and exec events, the read system calls for those tests would not need to do them. The result is that, for this test on this kernel, our predictive prefetching test has a lower read latency than the hot cache test, even though in the latter all the data is already in memory. This buffer copy problem has been fixed in version 2.4 of the Linux kernel.


  
Figure: Read system call latency distributions for selected runs of the Andrew benchmark (times in microseconds).
[Figure 8: (a) EPCM ps 64 mo 45 Test; (b) Hot Cache Test]


Tom M. Kroeger
2001-05-01