2.4 Read/Write Path


Using all the proposed ideas, we built a preliminary prototype and collected some measurements and statistics. One of the most interesting results was the distribution of the two possible read-hit types: read hits due write and read hits due read.

Read hit due write: this kind of hit appears when the page is in the cache because it has recently been swapped out but has not yet been discarded. This means that the page is requested shortly after it was swapped out.
Read hit due read: this kind of hit appears when the requested page is in a buffer that has recently been fetched from the disk. This means that another page, in the same disk block, has also been requested recently.
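
The two hit types can be told apart by remembering why each page is in the cache. The following is only a minimal sketch of that bookkeeping, not the prototype's code: the class `HitClassifier`, its methods, and the `block_pages` argument are hypothetical names introduced for illustration, and eviction is ignored.

```python
from collections import Counter

# Hypothetical labels for the reason a page is present in the cache.
DUE_WRITE = "due write"   # page was swapped out and has not been discarded yet
DUE_READ = "due read"     # page arrived in a disk block fetched for another page


class HitClassifier:
    """Toy model that only tracks why each page is cached (no eviction)."""

    def __init__(self):
        self.origin = {}          # page id -> DUE_WRITE or DUE_READ
        self.stats = Counter()    # counts of "due write", "due read" and "miss"

    def swap_out(self, page):
        # A swapped-out page stays in the cache until it is discarded.
        self.origin[page] = DUE_WRITE

    def swap_in(self, page, block_pages):
        # block_pages: ids of all pages stored in the same disk block as `page`.
        kind = self.origin.get(page, "miss")
        self.stats[kind] += 1
        if kind == "miss":
            # The whole block is fetched, so all of its pages become cached,
            # unless a more recently written copy is already there.
            for p in block_pages:
                self.origin.setdefault(p, DUE_READ)
        return kind
```

For instance, swapping out page 1 and requesting it again yields a hit due write, while requesting page 8 after a miss on page 7 (assumed to share a disk block with it) yields a hit due read.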

While examining both kinds of hits, we found that most of them were hits due write. This happens because the order in which pages are swapped out is not the same as the order in which they are swapped in, so the pages that share a disk block are seldom requested together. This led us to study the idea of not placing read buffers into the cache. This would allow recently written buffers to stay longer in the cache, which might increase the hit ratio. Furthermore, it should also improve write performance, as fewer blocks will have to be sent to the disk.

In order to examine the effect of not placing read buffers in the cache, we implemented two versions of the preliminary prototype: a first one where the read buffers were placed in the cache and a second one where they were not. After running a set of benchmarks on both prototypes, we observed that the number of hits obtained by both systems was quite similar in most cases [3]. Furthermore, we also observed that the number of disk writes performed when reads do not interfere with the cache is much lower than when read buffers are placed in the cache. This should increase the performance of the system, as fewer writes are done and a similar number of reads are needed (similar read hit ratio).

Not placing read buffers in the cache has another interesting side effect. As reads do not need to make room in the cache, they will never have to perform a write operation to clean a dirty buffer. This will avoid many disk accesses while swapping in pages.
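
The comparison between the two variants can be illustrated with a toy simulation. This is a sketch under strong assumptions, not the benchmark setup used for the measurements: the function `simulate`, the trace format, the LRU replacement policy and the one-page-per-entry cache are all hypothetical simplifications.

```python
from collections import OrderedDict


def simulate(trace, capacity, cache_reads):
    """Replay ("out", page) / ("in", page) events on a small LRU cache.

    Returns (read_hits, disk_writes). Every cache entry holds a single page,
    and a dirty entry costs one disk write when it is evicted.
    """
    cache = OrderedDict()   # page -> dirty flag, least recently used first
    hits = writes = 0

    def insert(page, dirty):
        nonlocal writes
        cache.pop(page, None)
        while len(cache) >= capacity:
            _, was_dirty = cache.popitem(last=False)
            if was_dirty:
                writes += 1          # cleaning a dirty buffer to make room
        cache[page] = dirty

    for op, page in trace:
        if op == "out":
            insert(page, True)       # swapped-out pages enter the cache dirty
        elif page in cache:
            hits += 1
            cache.move_to_end(page)
        elif cache_reads:
            insert(page, False)      # variant 1: read buffers enter the cache
        # variant 2 (cache_reads=False): a miss leaves the cache untouched
    return hits, writes


trace = [("out", 1), ("out", 2), ("in", 3), ("in", 1)]
print(simulate(trace, capacity=2, cache_reads=True))    # (0, 2)
print(simulate(trace, capacity=2, cache_reads=False))   # (1, 0)
```

The trace is contrived to exercise the eviction path and says nothing about real workloads. It only shows the mechanism: when reads are cached, a swap-in can force a dirty buffer to be written, and when they are not, reads never trigger a write.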

After this modification, disk blocks that are read will not be placed into the cache. This does not mean that swapping-in operations will not take advantage of the cache. They will first try to find the page in the cache, as it might have been written recently (read hit due write). If it is not in the cache, the system will read the disk block that contains the page, decompress the page and discard the rest of the pages stored in the same disk block. Figure 2 shows the new path for swapping pages in and out.

Figure 2: New swapping path where swapped-in pages are not kept in the cache.
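
The swapping path described above can be sketched as follows. This is an illustrative model only: the class `SwapCache`, its dictionaries and the `flush` method are hypothetical, the disk is simulated in memory, and `zlib` merely stands in for whatever compression algorithm the system uses.

```python
import zlib


class SwapCache:
    """Toy swap path in which only written pages live in the cache."""

    def __init__(self):
        self.write_cache = {}   # page id -> compressed page awaiting write-back
        self.disk = {}          # block id -> {page id: compressed page}
        self.location = {}      # page id -> id of the disk block that holds it

    def swap_out(self, page, data):
        # Pages are compressed on their way out and kept in the cache.
        self.write_cache[page] = zlib.compress(data)

    def flush(self, block):
        # Send all cached pages to one disk block (batching is simplified).
        self.disk[block] = dict(self.write_cache)
        for page in self.write_cache:
            self.location[page] = block
        self.write_cache.clear()

    def swap_in(self, page):
        compressed = self.write_cache.get(page)      # read hit due write?
        if compressed is None:
            block = self.disk[self.location[page]]   # one disk block read
            compressed = block[page]                 # other pages are discarded
        return zlib.decompress(compressed)
```

Note that `swap_in` never inserts anything into `write_cache`, so it never needs to evict or clean a buffer.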

Finally, another important side effect of not caching read requests is a simplification of the code. We will not go into many details now, but it is clear that a swapping-in operation will only have to search for the page in the cache or read it from the disk. It will not have to worry about cleaning buffers from the cache, and it will also avoid most of the locking problems.


Toni Cortes
Tue Apr 27 17:43:22 MET DST 1999