To adapt to recent program behavior, our statistics are decayed exponentially with time. Time, however, is defined as the number of interesting events elapsed. The events our system considers ``interesting'' are page touches that could affect our cost-benefit analysis (i.e., touches that would have been hits had we compressed as much memory as any of our target compression sizes currently suggests). Defining time this way has the benefit that touches to very recently used pages are ignored, thus filtering out high-frequency events.
Additionally, the decay factor used is inversely proportional to the size of memory (total number of page frames), so that time typically advances more slowly for larger memories than for small ones--small memories usually have a shorter replacement cycle, and need to decay their statistics at a faster rate than larger ones.
If the decay rate were not inversely proportional to the memory size, time would advance inappropriately slowly for small memories and inappropriately quickly for large ones. The small cache would wait too long to respond to changes, and the large one would twitchily ``jump'' at brief changes in program behavior, which are likely not to persist long enough to be worth adapting toward.
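The scheme above can be sketched in a few lines of code. This is an illustrative reconstruction, not the paper's implementation: the class and method names, the base decay constant, and the single decayed counter are all assumptions made for exposition; the key points are that only interesting touches advance time, and that the per-event decay factor is derived from a rate inversely proportional to the number of page frames.

```python
import math

class DecayedStats:
    # Illustrative base rate (an assumption, not from the paper):
    # the effective decay rate per event is BASE_RATE / num_frames,
    # i.e., inversely proportional to memory size.
    BASE_RATE = 1.0

    def __init__(self, num_frames):
        # Larger memories get a per-event decay factor closer to 1,
        # so their statistics decay more slowly per event.
        self.decay = math.exp(-self.BASE_RATE / num_frames)
        self.benefit = 0.0  # decayed count of would-have-been hits

    def record_touch(self, interesting):
        # Touches to very recently used pages are not ``interesting''
        # and do not advance time, filtering out high-frequency events.
        if not interesting:
            return
        self.benefit *= self.decay  # time advances by one event
        self.benefit += 1.0         # credit this would-be hit
```

With this scaling, a 64-frame memory decays its statistics noticeably faster per event than a 4096-frame one, matching the shorter replacement cycle of the smaller memory.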
Extensive simulation results show that this strategy works as intended: our adaptivity ensures that for any memory size, the cache responds to changes in the recent behavior of a program relatively quickly, so that it can benefit from relatively persistent program behavior, but not so quickly that it is continually ``distracted'' by short-duration behaviors.
A single setting of the decay factor (relativized automatically to the memory size) works well across a variety of programs, and across a wide range of memory sizes.