next up previous
Next: Simulation Up: Mechanism Previous: The Prefetch Algorithm

Compatibility Computation

We now take a closer look at the compatibility function. This function, essentially a similarity metric between access trees, abstracts out the complexities of prefetch analysis and pattern tree maintenance. We illustrate the definition with the example in Figure 2. A pattern tree is shown in 2(a).

  figure70
Figure 2: Computation of Compatibility. Graph (a) shows a pattern tree. Graph (b) shows a finished working tree, bearing a 0.575 compatibility with the given pattern tree. Graph (c) shows a unfinished working tree, so far bearing a 0.5 compatibility with the pattern tree; E is the pivot in the pattern tree, since it corresponds to the latest child node in the working tree being formed.

Let us first consider the case that the working tree is a finished one. We try to pair up the immediate child nodes in the working tree with identical child nodes in the pattern tree, preserving the order of the nodes. Recall that a child node can represent either an executable file, which may root another access tree, or a data file. For the two trees to be considered to describe the same file access pattern, we require that there be a one-to-one correspondence between all the executable files, but not all the data files. However, the same executable file may root a different access subtree in the working tree than in the pattern tree. We define tex2html_wrap_inline842 as the percentage of data files that can be paired up, tex2html_wrap_inline844 as the percentage of pairs of executables that root the same access subtree. Intuitively, tex2html_wrap_inline842 suggests how ``compatible'' the data files are, tex2html_wrap_inline844 suggests how ``compatible'' the executable files are. We choose to use the average of the two values as the compatibility of the two trees in question.

Let us consider the finished working tree in Figure 2(b). There are three data files in the working tree and two in the pattern tree. Out of these five files, only the two appearances of E can be paired up, making tex2html_wrap_inline842 0.4. Three out of four executable pairs (B, F and G) root identical access subtrees, so tex2html_wrap_inline844 is 0.75. The compatibility is thus 0.575.

The case of an unfinished working tree is similar except that one child in the pattern tree is first determined to be the pivot node. The pivot corresponds to the most recently added child node in the working tree. Only the child nodes in the pattern tree that appear before the pivot are involved in compatibility computation. If the pattern tree is selected, we prefetch those files that follow the pivot in the sequence of pattern tree preordering, since those before the pivot probably have already been accessed. Given the unfinished working tree in Figure 2(c), node E is the pivot in the pattern tree. tex2html_wrap_inline842 and tex2html_wrap_inline844 are both 0.5, giving rise to a 0.5 compatibility. If we decide to prefetch this pattern tree, only the files in subtrees tex2html_wrap_inline858 and tex2html_wrap_inline860 will be prefetched.

Since the compatibility function is invoked often, it is important that it not be expensive. That is why we examine only immediate child nodes. The time complexity of the computation is proportional to the number of child nodes.


next up previous
Next: Simulation Up: Mechanism Previous: The Prefetch Algorithm

An Analytical Approach to File Prefetching
Hui Lei and Dan Duchamp