Check out the new USENIX Web site. next up previous
Next: GNU Emacs and Its Up: Hummingbird and Its Dynamic Previous: Hummingbird's Memory Objects

Hummingbird's Dynamic Memory Activity

We instrumented Hummingbird to generate a trace of its dynamic memory activity. The trace we studied was captured from a Hummingbird run corresponding to the processing of an HTTP proxy log of four days. The trace contains about 58 million events (dynamic memory allocations and deallocations). These events correspond to the dynamic memory activity of Hummingbird when it was processing 4.8 million HTTP requests. Those requests contain 14.3 GB of unique data and 27.6 GB total data.

The total amount of live memory in Hummingbird was fixed at about 217 MB. Note that many memory allocation events in the trace have no corresponding deallocation events. These events correspond to the allocation of memory objects that stayed in memory when the run ended.

Figure 1 depicts the distribution of Hummingbird's object sizes against their frequency. The left portion of the graph, which corresponds to objects less than 128 bytes in size, indicates that there is a large number of fixed-sized objects holding meta-data, and these objects are only of a few fixed sizes. The graph also shows that there is a large number of object sizes longer than 128 bytes. These are the variable-sized objects holding the file data. There are no object sizes which are especially common, except for 32 KB, which is the cluster size. A unique feature of the graph is that the distribution of objects larger than 128 bytes is represented by almost a straight line on the $\log-\log$ scale. This feature suggests that the distribution of the variable-sized objects is Zipfian [10].

In a Zipf distribution, the frequency of occurrences of an event is inversely proportional to its rank $r$ using the formula $\frac{1}{r^a}$, where $a$ is close to unity. In other words, the relative frequency of the most common event is 1, since its rank is also 1. The relative frequency of the $n$th most common event is $\frac{1}{n^a}$, since its rank is $n$.

Table 1 shows the most common Hummingbird object sizes, their relative frequency, and the fraction of the memory allocated to objects of this size. The ``total size allocated'' column is the total amount of memory allocated for objects of this size (or size range) during the entire run. Note that more than half of the total allocated memory was for objects of size 32 KB, which is the Hummingbird cluster size. Since most disk accesses (reads and writes) are to full clusters, objects containing clusters are created and deleted frequently.

Figures 2 and 3 depict the lifetime of objects. The object lifetime is presented in two metrics: the average amount of new dynamic memory allocated during the lifetime of the object (Figure 2), and the average number of new dynamic memory objects allocated during the lifetime of the object (Figure 3). Both figures show that objects larger than 128 bytes and less than 32 KB are very long-lived, that is, an absence of locality in the patterns of sizes of deallocations and requests for new chunks of memory.

Figure 1: Distribution of Hummingbird's object sizes.
\includegraphics{frequency.eps}

Figure 2: Memory allocated during the lifetime of a Hummingbird memory object as a function of its size.
\includegraphics{lifeplot.eps}


Table 1: The distribution of the most frequent object sizes in Hummingbird. The table is sorted by decreasing frequency. Object sizes are always a multiple of double word (8 bytes).
object % of total % of total
size objects size
(bytes) allocated allocated
8 25.813 0.118
32 23.775 0.436
24 9.848 0.136
56 7.518 0.242
16 7.056 0.065
48 6.604 0.182
64 3.705 0.136
40 3.196 0.073
32768 2.975 55.935
72 1.588 0.066
80-120 0.911 0.045
128-32760 6.647 20.197
32776-39966072 0.364 22.370
total 100.000 100.000


From the above discussion, we can observe that Hummingbird's dynamic memory activity is very different from the kind of activity that memory allocators are designed and optimized for. Most studies conclude that there are a small number of distinct dynamic object sizes in real programs. Johnstone and Wilson [2] say that $99.9\%$ of the objects are of just 141 sizes! However, Hummingbird allocates a very large number of object sizes (more than 18000). Many mallocs assume that there is a strong temporal correlation between allocation and deallocation sizes. In other words, an allocation is likely to specify the memory size of a recently deallocated object. However, Hummingbird's dynamic memory activity has little correlation between the recently freed objects and new allocation requests.


next up previous
Next: GNU Emacs and Its Up: Hummingbird and Its Dynamic Previous: Hummingbird's Memory Objects