Check out the new USENIX Web site. next up previous
Next: Sparse Interface Virtual Tables Up: Performance Enhancements Previous: Performance Enhancements

Bidirectional Object Layout

In this subsection, we propose a new object layout that optimizes the placement of reference fields to allow efficient gc tracing.

The Java heap is by definition a garbage collected area. A Java programmer has no control on the deallocation of an object. Garbage collectors can be divided into two major classes: tracing and non-tracing collectors. Non-tracing collectors (mainly reference counting) cannot reclaim cyclic data structures, are a poor fit for concurrent programming models, and have a high reference count maintenance overhead. For this reason, Java virtual machine designers usually opt for a tracing collector.

There exist many tracing collectors[22]. The simplest models are mark-and-sweep, copying, and
mark-compact. The common point to all tracing collectors (including advanced generational, conservative and incremental techniques) is that they must trace a subset of the heap, starting from a root set, looking for reachable objects. Tracing is often one of the most expensive steps of garbage collection[22]. For every root, the garbage collector (gc) looks up the type of the object to find the offset of its reference fields, then it recursively visits the objects referenced by these fields.

To provide efficient field access, it is desirable to place fields at a constant offset from the object
header, regardless of inheritance. This is easily
achieved in Java as instance fields can only be declared in classes (not in interfaces), and classes are restricted to single inheritance. Fields are laid out consecutively after the object header, starting with super class fields then subclass fields, as shown in Figure 3(a). When tracing such an object, the
garbage collector must access the object's class information to discover the offset of its reference fields, then access the superclass information to obtain the offset of its reference fields, and so on. As this process must be repeated for each traced object, it is quite expensive.

Figure 3: Traditional layout
\begin{figure*}\centerline{\ \psfig{figure=figures/naive.eps,width=160mm}}\end{figure*}

There are three improvements that are usually applied to this naive representation. Firstly, reference fields are grouped together in the layout of each class. Secondly, each class stores an array of offsets and counts of reference fields for itself and all its super classes. Thirdly, a one word bit array is used in the virtual table to represent the layout of reference fields in small objects (each bit being set if the object instance word, at the same index, is a reference). This is shown in Figure 3(b). For big objects, the number of memory accesses needed to trace an object is $n+3+(2*array size)$, where $n$ is the number of references. Two nested loops (and loop variables) are required: one to traverse the array, and one for each array element (accessing the related number of references). For smaller objects, the gc needs to access the virtual table to retrieve the bit field word, then it needs to perform a set of shift and mask operations to find the offset of reference fields. Overall, using this layout, tracing an object is a relatively complex operation.

Tracing reference fields could be much simpler if they were simply grouped consecutively. The difficulty is to group them while keeping the constant offset property in presence of inheritance.

We introduce a bidirectional object instance layout that groups reference fields while maintaining the constant offset property. The left part of Figure 4 illustrates this new layout. In the bidirectional object instance layout, the instance starting point is possibly a reference field. The instance grows both ways from the object header, which is located in the middle of the instance. References are placed before the header, and other fields are placed after it. The right part of Figure 4 illustrates the layout of array instances. Array element are placed in front or after the array instance header, depending on whether the element type is a reference or a non-reference type, respectively.

Figure 4: Bidirectional layout
\begin{figure*}\centerline{\ \psfig{figure=figures/instance.eps,width=160mm}}\end{figure*}

The object header contains two words (three for arrays). The first is a lock word and the second is a virtual table pointer. We use a few low-order bits of the lockword encode the following information:

We also use two words of the virtual table (see Figure 5) to encode the number of reference and non-reference field words of the object if the object is too big to encode this information in the header.

At this point, we must distinguish the two ways in which an object instance can be reached by a tracing collector. The first way is through an object reference that points to the object header (which is in the middle of the object). The second way is through its starting point, in the sweep phase of a mark-and-sweep gc, or in the tospace scanning of a copying gc. In both cases, our bidirectional layout allows the implementation of simple and elegant tracing algorithms.

In the first case, the gc accesses the lock word to get the number of references $n$ (one shift, one mask). If $n$ is the overflow value (big object), then $n$ is retrieved from the virtual table. Finally, the gc simply traces $n$ references in front of the object header.

In the second case, the object instance is reached from its starting point in memory, which might be either a reference field or the object header (if there are no reference fields in this instance). At this point, the gc must find out whether the initial word is a reference or a lock word. But, this is easy to find. The gc needs simply check the state of the last bit of the word. If it is one, then the word is a lock word. If it is zero, then the word is a reference.

So, for example, a copying collector, while scanning the tospace needs only read words consecutively, checking the last bit. When set to zero, the discovered reference is traced, when set to 1, the number of non-reference field words (encoded in the lock word itself, or in the virtual table on overflow) is used to find the starting point of the next instance.

In summary, using our bidirectional layout, a gc only accesses the following memory locations while tracing: reference fields and lock word, for all instances (objects and arrays), and at most three additional accesses for objects with many fields (virtual table pointer and two words in the virtual table itself).

While our work on bidirectional objects for grouping references is new, we mention some previous related work. The idea of using a bidirectional object layout (without grouping references) has been investigated[25,28] as a mean to provide efficient access to instance data and dispatch information in languages supporting multiple inheritance (most specifically C++). In [14], Bartlett proposed a
garbage collector which required grouping pointers at the head of structures; this was not achieved using bidirectional structs.


next up previous
Next: Sparse Interface Virtual Tables Up: Performance Enhancements Previous: Performance Enhancements
2001-02-27