For these simulations, we traced six programs on an Intel x86
architecture under the Linux operating system with a page size of 4 KB
(we will study the effect of larger page sizes in
Section 4.3). The behavior of most of these programs is
described in more detail in [WJNB95]. Here is a brief
description of each:
- gnuplot: A plotting program, run on a large input to
produce a scatter plot.
- rscheme: A bytecode-based interpreter for a
garbage-collected language. Its performance is dominated by the
runtime of a generational garbage collector.
- espresso: A circuit simulator.
- gcc: The component of the GNU C compiler that actually
performs C compilation.
- ghostscript: A PostScript formatting engine.
- p2c: A Pascal-to-C translator.
These programs make a good test suite for locality experiments, since
we want to test how well our compressed caching policy adapts to
locality patterns at various memory sizes. Their data footprints vary
widely: gnuplot and rscheme are large programs (with over 14,000 and
2,000 pages, respectively), gcc and ghostscript are medium-sized
(around 550 pages each), while espresso and p2c are small (around 100
pages each).
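To put these page counts in more familiar units, the following minimal sketch converts them to mebibytes using the 4 KB page size stated above. The page counts are the approximate figures quoted in the text ("over" and "around"), so the results are rough estimates; the names `FOOTPRINT_PAGES` and `footprint_mib` are illustrative and do not come from the original work.

```python
PAGE_SIZE_BYTES = 4 * 1024  # 4 KB pages, as in the traced Linux/x86 setup

# Approximate page counts quoted in the text ("over" / "around" figures).
FOOTPRINT_PAGES = {
    "gnuplot": 14_000,
    "rscheme": 2_000,
    "gcc": 550,
    "ghostscript": 550,
    "espresso": 100,
    "p2c": 100,
}


def footprint_mib(pages, page_size=PAGE_SIZE_BYTES):
    """Convert a page count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return pages * page_size / (1024 * 1024)


if __name__ == "__main__":
    for name, pages in FOOTPRINT_PAGES.items():
        print(f"{name:12s} {pages:6d} pages  ~{footprint_mib(pages):6.2f} MiB")
```

Under these figures the largest trace (gnuplot) is roughly 55 MiB and the smallest are under half a mebibyte, which is the spread the locality experiments rely on.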
We used the following three processors:
- Pentium Pro at 180 MHz: This processor approximately
represents an average desktop computer at this time. Compressed
caching is not only for fast machines.
- UltraSPARC-10 300 MHz: While one of the fastest
processors available now, it will be an average processor two years
from now. Compressed caching works even better on a faster processor.
- UltraSPARC-2 168 MHz: A slower SPARC machine which
provides an interesting comparison to the Pentium Pro, due to its
different architecture (e.g., faster memory subsystem).
We used three different compression algorithms in our experiments:
- WKdm: A recency-based compressor that operates on
machine words and uses a direct-mapped, 16-word dictionary and a fast
- LZO: Specifically LZO1F, a carefully coded
Lempel-Ziv implementation designed to be fast, particularly on
decompression. It is well suited to compressing small blocks of
data, using small codes when the dictionary is small. While all
compressors we study are written in C, this one also has a
speed-optimized implementation (in Intel x86 assembly) for the Pentium
- LZRW1: Another fast Lempel-Ziv implementation. This
algorithm was used by Douglis in [Dou93]. While it does not
perform as well as LZO, we wanted to demonstrate that even this
algorithm would allow for an effective compressed cache on today's
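The idea of a direct-mapped dictionary of recently seen words, as used by WKdm, can be illustrated with a small sketch. This is not the authors' implementation. The text above states only that the compressor works on machine words with a direct-mapped, 16-word dictionary. The 32-bit word size, the hash in `slot`, the 10-bit low field used for partial matches, and the four tag names are all assumptions made for illustration.

```python
DICT_SIZE = 16          # 16-word dictionary, as described in the text
LOW_BITS = 10           # assumption: low bits ignored when testing partial matches
WORD_MASK = 0xFFFFFFFF  # assumption: 32-bit machine words


def slot(word):
    """Hypothetical hash: choose a dictionary slot from the word's high bits."""
    return (word >> LOW_BITS) % DICT_SIZE


def classify(words):
    """Tag each word as 'zero', 'exact', 'partial', or 'miss'."""
    dictionary = [None] * DICT_SIZE
    tags = []
    for w in words:
        w &= WORD_MASK
        if w == 0:
            tags.append("zero")
            continue
        i = slot(w)
        entry = dictionary[i]
        if entry == w:
            tags.append("exact")
        elif entry is not None and (entry >> LOW_BITS) == (w >> LOW_BITS):
            tags.append("partial")
        else:
            tags.append("miss")
        # Direct-mapped: the new word always replaces the slot's old occupant.
        dictionary[i] = w
    return tags


if __name__ == "__main__":
    sample = [0, 0x12345678, 0x12345678, 0x12345679, 0xDEADBEEF]
    print(classify(sample))
```

Because each word maps to exactly one slot, a lookup costs one comparison and no search. The price is that two unrelated words hashing to the same slot evict each other. A real compressor would go on to emit a compact encoding for each tag; this sketch stops at classification.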
Scott F. Kaplan