
Kernel Support

The modifications to the Linux kernel mainly ensure that the prefetch thread is scheduled ahead of, but not too far ahead of, the computation thread. When the prefetch library of an AASFP application registers the process IDs of the application's prefetch and computation threads with the kernel, the kernel allocates a prefetch queue for that application in the kernel address space. When the application's prefetch thread informs the kernel that its user-level prefetch queue is full, the kernel copies the queued entries to the application's kernel-level prefetch queue. If there is not enough space in the kernel-level prefetch queue, the prefetch thread is put to sleep; the kernel wakes it up only when enough free entries become available in the kernel prefetch queue. The kernel prefetch queue is larger than the prefetch library's queue to avoid unnecessary stalls. Note that the prefetch calls in the prefetch thread merely prepare disk prefetch requests; they do not initiate physical prefetch operations. In our experiments we use $300$ for the prefetch library queue size, an empirical value that is about $1/10$ of the total number of pages accessed by our test cases, as shown in Table 1. We also set a wakeup threshold of $64$: when at least this many free entries are available in the kernel prefetch queue, the prefetch thread is woken up. These numbers may require tuning on different architectures. The simple Backward 1 case, described in the performance evaluation section, serves as a good guideline for tuning these numbers effectively.

For prefetchable files, the AASFP kernel turns off Linux's sequential prefetching mechanism and supports application-specific prefetching instead. Whenever an AASFP application's computation thread makes a disk access call, the kernel first satisfies the access with data already prefetched into the file buffer, and then issues asynchronous disk reads for a certain number of requests in the kernel-level prefetch queue. That is, physical disk prefetching is triggered by the disk accesses that the computation thread makes. This scheme works well for applications with periodic I/O calls. However, if an application performs a long computation followed by a burst of I/O, physical disk prefetch operations may be invoked too late to mask all of the disk I/O delay. Therefore AASFP additionally uses a timer-driven approach to schedule disk prefetch operations. That is, on every Linux timer interrupt (roughly every $10\,ms$), the CPU scheduler assigns a higher priority to the prefetch thread if the kernel-level prefetch queue contains no entries, so that the prefetch thread gets scheduled soon to refill it. Furthermore, the kernel checks whether any entries in the kernel-level prefetch queue should be moved to the disk queue, according to the algorithm described next. Before a request in the kernel-level prefetch queue is serviced, the kernel checks whether the request is still valid by comparing its prefetch call ID with the number of disk I/O calls the computation thread has made up to that point. If the entry's prefetch call ID is smaller, the computation thread has already accessed the corresponding block; the entry is invalid, and the kernel simply deletes it. For a valid entry, the kernel dynamically decides whether to service it at that moment. To make this decision, the kernel maintains the current number of entries in the disk queue ($K$), the average time taken to service a disk request ($T$), and the application's average computation time between two consecutive disk I/O calls ($C$).
Suppose at time $t$, the current disk I/O call's ID is $i$, and the prefetch call ID of the entry is $j$. Then the time available before the application accesses the block corresponding to the $j$th prefetch call is approximately $C \times (j - i)$. A prefetch request sent to the disk queue at time $t$ is expected to complete at time $t + (K+1) \times T$. Therefore the kernel should service the prefetch request only if

\begin{displaymath}
(C \times (j-i)) - ((K+1) \times T) \leq Time\_Threshold \nonumber
\end{displaymath}

and

\begin{displaymath}
(j - i) \leq Queue\_Threshold \nonumber
\end{displaymath}  

The first condition ensures that the disk block is prefetched shortly before it is needed, and the second ensures that not too many prefetched blocks accumulate in the buffer cache. Keeping the file buffer cache from being over-committed is essential to prevent interference between the current and future working sets. Both $Queue\_Threshold$ and $Time\_Threshold$ are empirical constants that users fine-tune, based on hardware configuration and workload characteristics, by observing system performance; this tuning needs to be done only once per architecture.


Table 1: Characteristics of test applications.

Test Case     Scenario Description                        # of (4KB) Pages Accessed
Vol Vis 1     16MB, orthographic view, 4KB-block          4096
Vol Vis 2     16MB, non-orthographic view, 4KB-block      3714
Vol Vis 3     16MB, orthographic view, 32KB-block         4096
Vol Vis 4     16MB, non-orthographic view, 32KB-block     3856
FFT 256K      2MB, 256K points, 4KB-block                 2944
FFT 512K      4MB, 512K points, 4KB-block                 6400
Forward 1     16MB, read forward, 4KB stride              4096
Backward 1    16MB, read backward, 4KB stride             4096
Forward 2     16MB, read forward, 8KB stride              2048
Backward 2    16MB, read backward, 8KB stride             2048



chuan-kai yang 2002-04-15