Which workloads to use

Studying caching algorithms with traces is popular because it simplifies the experimental setup while still reproducing a real-life workload. The alternative is to use synthetic workload generators, which are flexible and can simulate a large number of scenarios for which obtaining real-life traces is not practical.

When the algorithms being tested involve prefetching, traces are not a good choice. The timing of the I/Os is crucial for prefetching algorithms: a read can be a hit or a miss depending on how much time has passed between consecutive requests, because a prefetched block may or may not have arrived in the cache by the time it is requested. This is not the case with pure demand-paging algorithms, which yield the same hit ratio regardless of the timing between read requests. To remain faithful to the timing information in a trace, we would need to replay it on hardware comparable to the system that generated it. For example, a trace from a data server with hundreds of disks cannot be replayed at its original rate on a setup with only a few disks. We must therefore either simulate the original hardware on which the trace was collected or scale the speed of the trace according to the disparity between the two systems. With older traces, we may also need to factor in improvements in disk access times. Consequently, using traces to compare prefetching algorithms is extremely difficult and an approximation at best. We favor versatile workload generators that can simulate both simple workloads and complex ones such as OLTP and Video-on-Demand.
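The timing argument can be illustrated with a small simulation. The following is a minimal sketch under simplifying assumptions, not the setup used in this work: the cache is a toy model, the prefetcher is a hypothetical one-block-lookahead scheme with an unbounded cache and a fixed fetch latency, and the names `make_trace`, `lru_hit_ratio`, and `prefetch_hit_ratio` are invented for illustration. It replays the same block sequence at two request rates and compares the hit ratios.

```python
from collections import OrderedDict


def make_trace(gap):
    """Two sequential passes over blocks 0..4, one request every `gap` time units."""
    blocks = list(range(5)) * 2
    return [(i * gap, block) for i, block in enumerate(blocks)]


def lru_hit_ratio(trace, cache_size):
    """Pure demand paging with LRU eviction; request timestamps are ignored."""
    cache = OrderedDict()
    hits = 0
    for _time, block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(trace)


def prefetch_hit_ratio(trace, fetch_latency):
    """Hypothetical one-block-lookahead prefetcher with an unbounded cache.

    Every read of block b also starts fetching b + 1. A fetch completes
    `fetch_latency` time units after it was issued, and a read counts as
    a hit only if the block has already arrived.
    """
    arrival = {}  # block -> time at which it becomes available in the cache
    hits = 0
    for time, block in trace:
        if block in arrival and arrival[block] <= time:
            hits += 1
        elif block not in arrival:
            arrival[block] = time + fetch_latency  # demand fetch on a miss
        # Start prefetching the next block unless a fetch is already recorded.
        arrival.setdefault(block + 1, time + fetch_latency)
    return hits / len(trace)


slow = make_trace(gap=10)  # requests arrive slower than the fetch latency
fast = make_trace(gap=1)   # requests arrive faster than the fetch latency

print(lru_hit_ratio(slow, cache_size=8), lru_hit_ratio(fast, cache_size=8))
print(prefetch_hit_ratio(slow, fetch_latency=5), prefetch_hit_ratio(fast, fetch_latency=5))
```

In this toy model the demand-paging hit ratio is identical for both request rates, while the prefetching hit ratio drops when requests arrive faster than prefetches can complete. Replaying a trace at the wrong speed would therefore distort a comparison of prefetching algorithms even though it leaves demand-paging results unchanged.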

root 2006-12-19