Glimpse Indexing

For our third benchmark we used a glimpse [20] index of /usr/doc to represent a traversal of all the files under a given directory. This workload is significantly larger than either of the two studied previously. The results were similar to those from the GNU ld benchmark: total benchmark runtime was reduced by 16%, total I/O latency by 31%, and read latencies by as much as 92%.

The glimpseindex program produces a linear traversal of all the files in a large directory structure. We used version 4.1 of Glimpse to index /usr/doc. Files are accessed in the order in which they appear in their directories. The large majority of files are accessed only once; they are typically static files that were created when Linux was installed and have not been modified since. By comparison, the access order in the Andrew benchmark's workload depended on the Makefile and on the order in which header files were listed, and files such as header files and object files were accessed multiple times.
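The access pattern described above can be illustrated with a minimal sketch. This is not the glimpseindex implementation; the function name `traversal_order` and the choice to list a directory's files before descending into its subdirectories are assumptions made for illustration. The point it shows is that, without sorting, the access sequence is fixed by the order of entries in each directory.

```python
import os

def traversal_order(root):
    """Yield regular-file paths under root in directory-listing order.

    Hypothetical sketch: entries are taken in the order os.scandir
    returns them (no sorting), so the sequence of file accesses is
    determined by how files are ordered in their directories.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        subdirs = []
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    subdirs.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    yield entry.path
        # Descend into subdirectories after this directory's files,
        # keeping them in the order they were listed.
        stack.extend(reversed(subdirs))
```

Because each file is yielded exactly once, replaying this sequence on an unchanged directory tree gives the same file-to-file successor relationships from run to run.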

Tables 5 and 6 show the workload characteristics for the glimpse benchmark on our test machine. This workload contains significantly more disk accesses than the previous two, a total of 24,901 reads. A much higher fraction of these reads are cache misses: 11,812 misses, for a miss ratio of 0.47. Even the hot cache test incurs cache misses, indicating that this test accesses more data than the I/O caches can hold.
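As a quick check of the miss ratio quoted above, the following sketch recomputes each category's share of read calls from the counts in the read event count table. The dictionary layout and the `ratios` helper are illustrative assumptions; the numbers are those reported in the table. Because the counts are averages over 20 runs, the categories need not sum exactly to the number of calls.

```python
# Read event counts as reported in the read event count table.
counts = {
    "cold": {"calls": 24901, "hits": 258, "partial": 12828, "misses": 11813},
    "hot": {"calls": 24901, "hits": 5943, "partial": 12819, "misses": 6138},
}

def ratios(row):
    """Return each category's share of the total read calls."""
    return {k: row[k] / row["calls"] for k in ("hits", "partial", "misses")}

for test, row in counts.items():
    shares = {k: round(v, 2) for k, v in ratios(row).items()}
    print(test, shares)
```

The cold cache miss share rounds to 0.47, matching the text, and the hot cache run still misses on roughly a quarter of its reads.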

**Table:** *Workload time summary for the glimpse benchmark. Elapsed times are in seconds, all other times are in microseconds. Numbers in italics represent 90% confidence intervals.*

| Test | Elapsed | 90% CI | Compute | 90% CI | Read | 90% CI |
|------|---------|--------|---------|--------|------|--------|
| Cold | 172.0 | *0.84* | 82.7 | *0.12* | 1890 | *19.92* |
| Hot | 131.5 | *0.12* | 81.4 | *0.06* | 782 | *2.91* |

**Table:** *Read event count summary for the glimpse benchmark. Counts are the number of events that fell in each category, averaged across the last 20 runs of each test.*

| Test | Calls | Hits | Partial | Misses |
|------|-------|------|---------|--------|
| Cold | 24901 | 258 | 12828 | 11813 |
| Hot | 24901 | 5943 | 12819 | 6138 |

Figure 10 shows the results for the glimpse benchmark. The smallest EPCM test gave the best results, reducing total runtime by 16%, read latencies by as much as 92%, and I/O latency by 31%. Our PCM test had a 22% reduction for this workload. Last-successor-based prefetching did the worst, with an average total I/O latency reduction of 16%. Again we see that predictive prefetching has the potential for significant reductions in I/O latency and is effective at improving overall system performance.

**Figure:** Reductions in elapsed times and read latencies for the Glimpse benchmark with the last successor, PCM, EPCM and hot cache tests. Bars marked with P and E represent PCM and EPCM tests respectively. Partition sizes (ps) and model order (mo) are labeled as ps/mo.
[Figure panels: (a) Elapsed Time Reduction; (b) ...Reduction (graphs/glimpse.read.eps)]
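For readers unfamiliar with the last successor baseline compared above, here is a minimal sketch of the general idea: predict that the file which followed the current file the last time it was accessed will follow it again. The class and method names are hypothetical, and this is a generic illustration rather than the implementation evaluated in these tests. PCM and EPCM are not sketched here.

```python
class LastSuccessorPredictor:
    """Generic last-successor predictor (illustrative sketch only)."""

    def __init__(self):
        self.last_successor = {}  # file -> file that most recently followed it
        self.previous = None      # most recently accessed file

    def access(self, name):
        """Record an access and return the predicted next file, or None."""
        if self.previous is not None:
            # The current file is now the last successor of the previous one.
            self.last_successor[self.previous] = name
        self.previous = name
        return self.last_successor.get(name)


predictor = LastSuccessorPredictor()
predictions = [predictor.access(f) for f in ["a", "b", "c", "a", "b", "c"]]
print(predictions)
```

On the first pass over a sequence the predictor has no history and returns `None`; on a repeated pass in the same order it predicts each next file from the previous pass.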

Tom M. Kroeger
2001-05-01