The following paper was originally published in the Proceedings of the USENIX Conference on Object-Oriented Technologies (COOTS) Monterey, California, June 1995 For more information about USENIX Association contact: 1. Phone: 510 528-8649 2. FAX: 510 548-5738 3. Email: office@usenix.org 4. WWW URL: https://www.usenix.org Debugging Storage Management Problems in Garbage-Collected Environments David L. Detlefs and Bill Kalsow Digital Equipment Corporation Systems Research Center Palo Alto, CA 94301 {detlefs,kalsow}@pa.dec.com Garbage collection does not solve all storage management problems; programs allocate too much garbage, requiring excess collection, and may retain too much storage, causing heaps to grow too large. This paper discusses these problems and presents tools implemented in the SRC Modula-3 system that help solve them. 1. Introduction Many garbage collection enthusiasts, present authors included, have presented garbage collection as a panacea for all storage management problems. Like all marketing hype, this is something of an exaggeration. This paper discusses storage management problems that occur in garbage-collected systems, and describes some tools used in SRC Modula-3 [4][6] that aid in detecting and isolating such problems. 2. Problems with automatic storage management A garbage collector solves the two classic problems of explicit storage management: 1. "dangling pointers", where a block of storage is deallocated too early, while pointers to the block are still in use. If the block is reallocated, different parts of the program will be using the same region of memory for different purposes, with disastrous results. 2. "storage leaks", where blocks of storage are allocated but never deallocated. If this happens repeatedly in a long-running program, the program's memory requirements grow without bound. This problem is the converse of dangling pointers. Tools like Purify [7] help identify these problems, but further programming is necessary to solve them. However, when garbage collection is used, these problems never occur. If garbage collection solves these problems, then what could go wrong? More than enough, as we shall see. Excessive allocation may cause overly frequent collection. Moreover, even with collection, the heap may grow too large. There are a variety of causes for surprisingly large heaps: data structures that were designed without an upper bound on their size, references to ``dead'' heap objects that are hidden behind abstraction boundaries, and references hidden by the underlying compilation and runtime system. We consider each these problems in turn. 2.1 Excessive allocation If a program allocates a great deal of storage for short-term use, it creates a significant amount of garbage. That garbage must be collected; the more quickly garbage is created, the more often it must be collected. Generational techniques can help greatly in decreasing the cost of collecting such short-lived garbage. However, it is still possible to optimize the performance of most garbage-collected programs by locally reusing storage for the most frequently-allocated types, thereby avoiding garbage-collector overhead. Of course, such techniques have the same dangers as explicit storage management. 2.2 Unbounded data structures A garbage collector collects storage that is not reachable from the root set (the stacks and global variables) of the program. If the amount of reachable storage increases monotonically over time, a long-running program will still run out of memory, even with garbage collection. It is surprisingly common for programmers of long-running systems to create data structures that grow without bound. For example, programs often use caches to avoid redundant computation. If a program was originally used in a ``short-lived'' context, that is, it was used to compute a result and then exit, every result may have been cached. If this same program is converted for use in a ``long-lived'' server context, or is used on much larger input problems, then this strategy is unacceptable; a policy and mechanism for regulating the cache size must be added. This may sound obvious, but if the program is large and is developed by many programmers, it may be difficult to pinpoint all such data structures. Some may occur in unfamiliar libraries, perhaps written by third parties, and perhaps available only in object form. 2.3 References hidden by abstraction A data structure whose concrete state references a heap object while its abstract state does not may also cause the heap to grow too large. For example, consider the simple stack type whose interface and implementation are shown in figure 1. --------------------------------------------------------------------- INTERFACE RefStack; TYPE T <: Public; Public = OBJECT push(r: REFANY); pop(): REFANY; END; END RefStack. MODULE RefStack; REVEAL T = Public BRANDED OBJECT elems: ARRAY [0..99] OF REFANY; sp: INTEGER := 0; OVERRIDES push := Push; pop := Pop; END; PROCEDURE Push(self: T; elem: REFANY) = BEGIN self.elems[self.sp] := elem; INC(self.sp); END Push; PROCEDURE Pop(self: T): REFANY = BEGIN DEC(self.sp); RETURN self.elems[self.sp]; END Pop; BEGIN END RefStack. Figure 1: References hidden by abstraction. --------------------------------------------------------------------- The Pop procedure removes the top element pointer from the abstract stack. But note that the ``removed'' pointer remains in the concrete state of the stack; the elems array is not modified by Pop. So if we pushed 100 pointers to large graph structures, then popped them all, and then didn't use the stack again, the graph structures would be retained as long as the stack was; if the stack were a global variable, this would be the remainder of the program's lifetime. This kind of problem can be especially bothersome to pin down, since we are accustomed to thinking of our data types in abstract terms whenever possible. It is therefore necessary in a garbage-collected environment to modify such data structures to keep the references in the abstract and concrete states synchronized. In the example above, we would modify the Pop procedure to remove the concrete reference, as shown in figure 2. --------------------------------------------------------------------- PROCEDURE Pop(self: T): REFANY = VAR res: REFANY; BEGIN DEC(self.sp); res := self.elems[self.sp]; self.elems[self.sp] := NIL; RETURN res END Pop; Figure 2: Corrected version of Pop. --------------------------------------------------------------------- 2.4 References hidden by the system A similar problem can occur in places beyond the programmer's control. Consider an execution of a program with procedures, A, B, C, and D. A calls B, whose preamble reserves space on the stack for a local variable x, a pointer to a heap object X. B calls C, but saves on the stack the value of x, which had been in a register, on the stack, creating the situation shown in figure 3 (part a). C returns, and B returns, leaving the pointer in a dead area of the stack, as shown in figure 3 (part b). At this point there is no problem; if X is otherwise unreferenced, a collection could reclaim it. But A now calls D, which also allocates space on the stack to store a value. Before D stores anything into this location, however, a collection occurs -- perhaps D requested a heap allocation. The heap pointer stored by B is dead, but is located in the active area of the stack, as shown in figure 3 (part c). The object X, and all objects reachable from it, will be retained by the collection. --------------------------------------------------------------------- Figure 3: References hidden by the runtime-system. --------------------------------------------------------------------- --------------------------------------------------------------------- Figure 4: Shownew output (before). --------------------------------------------------------------------- We should note that this problem is probably not too important in single-threaded systems, since any pointer on the stack that causes storage to be retained in one collection is likely be overwritten by stack activity before the next collection occurs. However, in multi-threaded environments, the problem may be more serious. Imagine in the example above that procedure D, instead of triggering a garbage collection, waits on a condition variable that is rarely signalled. The thread executing D will be blocked for some time, perhaps for many garbage collections. Those collections will retain the object X. Note that this is not a problem confined to conservative collectors such as the collectors of Bartlett [1] or Boehm and Weiser [2], which assume any bit pattern in the stack that looks like a pointer is a pointer. Lisp systems using hardware tags are just as vulnerable; the pointer values in the stack locations are perfectly valid pointers --- they just aren't live at the time of collection. The Boehm-Weiser collector attempts to prevent this problem; it zeros the part of the stack above the stack pointer on each collection. A complete solution to this problem requires a great deal of cooperation between a garbage collector and a compiler. The compiler must either enable the collector to precisely determine which stack values and registers are live at the start of a collection, or generate code to ``NIL out'' pointer-containing stack locations on procedure entry or exit. Finally, we should note that while this problem may seem obscure, it does occur in practice. Retention of excess storage was traced to precisely this situation in the first version of the Vesta configuration management system [5]. We know of no certain fix for this problem other than requiring compilers to produce code in which procedures zero-fill their stack frames on entry (or, alternatively, on exit). The obvious performance penalties of these solution make them somewhat unattractive. 3. Tools This section describes four tools we have developed to aid programmers both find and fix storage management problems in long-lived garbage-collected programs. These tools are all implemented as part of the runtime system for SRC Modula-3. We give ``real-life'' examples of the use of each tool. These examples arise from problems encountered in the Extended Static Checking (henceforth ESC) program verification system being developed at SRC. 3.1 Diagnosing excessive allocation Excessive allocation causes frequent and potentially intrusive garbage collection. Shownew is a tool that allows a user to observe the allocation behavior of a program. Shownew is integrated into the runtime system, so that any SRC Modula-3 program can be passed a special command-line argument that will cause it to run under the control of a Shownew process. Shownew presents a bar graph indicating how much storage of each type is being allocated. A menu allows the user to indicate whether the graph should display the number of objects or bytes allocated, and whether the numbers should indicate totals since the beginning of the program, or only new allocations since the display was last updated. Shownew allows the programmer to pinpoint what types are being allocated most often and might be causing excessive collection. With Shownew, it is quite easy and almost always worthwhile to determine what types are allocated most frequently in your program. The answers are sometimes surprising, and occasionally represent bugs that are simple to correct. When the ESC system was found to be embarrassingly slow on a new example, we tried Shownew to see if excessive allocation was a problem. The result is shown in Figure 4. (Of course, the bars of the graph appear in color on a color monitor.) The identity of the type allocated most frequently was a complete surprise. ExprVarRefTbl.EntryList is an internal type used in the implementation of a set type provided by the Modula-3 library. These allocations were eventually traced to code that applied a variable substitution to a universally quantified formula (a ForAll), as shown in figure 5. The argument exc is a set of ``excluded'' variables, variables that are not to be substituted. Since a quantifier binds the quantified variables, ForAllSubst adds its own quantified variables, self.vs, to exc before applying the substitution to its body. --------------------------------------------------------------------- PROCEDURE ForAllSubst(self: ForAll; s: Subst.T; exc: ExprVarSet.T): ForAll = BEGIN RETURN MkForAll(self.vs, self.body.subst(s, exc.union(self.vs))); END ForAllSubst; Figure 5: Excessive allocation (before). --------------------------------------------------------------------- It happened that the type ExprVarSet.T was implemented using hash tables. The set type's union method is non-destructive, so exc.union(self.vs) makes a copy of exc, adds the elements of self.vs, and returns the copy. Note that the copy is discarded as soon as the call to MkForAll returns. --------------------------------------------------------------------- PROCEDURE ForAllSubst(self: ForAll; s: Subst.T; exc: ExprVarSet.T): ForAll = VAR res: ForAll; BEGIN exc := exc.unionD(self.vs); res := MkForAll(self.vs, self.body.subst(s, exc)); exc := exc.diffD(self.vs); RETURN res; END ForAllSubst; Figure 6: Excessive allocation (after). --------------------------------------------------------------------- We modified this procedure to avoid the copy, as shown in figure 6. The usual way to avoid this copy, and the one we used, is to substitute destructive operations (in which exc is modified) for functional operations (in which exc is not modified). The call exc.unionD(self.vs), for example, adds the elements of self.vs to the set exc, modifying its value. Similarly, diffD destructively deletes elements from a set. In this example, several assumptions are necessary to argue the correctness of the transformation. First, exc is not being accessed by any concurrently executing threads. Second, it is a precondition of ForAllSubst that exc and self.vs are disjoint. Third, the call self.body.subst(s, exc) does not retain a reference to exc. Using destructive operations often yields significant performance benefits, but the subtlety of the assumptions necessary to show them correct argues that they should be used sparingly. A tool like Shownew helps identify the most promising targets. --------------------------------------------------------------------- Figure 7: Shownew output (after) --------------------------------------------------------------------- The first picture in figure 7 shows the result of Shownew on this program after the change described above. Note that the EntryList type is now only the sixth of the most frequently allocated types; the number of EntryList's allocated decreased by more than 95%. Similar transformations resulted in the situation shown in the second picture of that figure, where the absolute numbers of the most frequently allocated types have dropped dramatically. The implementation of Shownew is fairly simple. Each (garbage-collected) heap-allocated object in Modula-3 has a header that contains a typecode, a unique integer corresponding to the dynamic type of the object. From a typecode it is easy to find the name, size, and various other attributes of the corresponding type. A NEW(T) expression in the source is compiled to a call to the runtime's allocation routine with the typecode of T as an argument. When Shownew is being used, the allocator counts the objects allocated of each type, periodically forwarding that information to the Shownew process. 3.2 Excess retained storage The previous section dealt with allocation. Allocation and heap size are obviously related, since something must be allocated to become part of the heap, but they are also quite different. A type that is allocated often may contribute nothing to the heap size, if all instances quickly become garbage and are collected, while a type that is allocated infrequently, but whose instances are retained, will cause the heap to grow without bound. The rest of this section introduces three tools used to diagnose problems in the theorem-prover used in ESC. In long-running proofs, the theorem-prover's heap seemed to be growing continually larger, while we expected the storage requirements of the systems to quickly reach a steady state. 3.2.1 RTutils.Heap The first tool we used was the procedure RTutils.Heap, which reports the composition of the heap by data type. If we explicitly run a collection before generating such a report, we obtain a breakdown by type of live objects in the heap. RTutils.Heap has several options that allow the report to be ordered by number of allocated objects or bytes occupied, and to limit the report to the top n types by the requested ranking method. We modified ESC to periodically perform a garbage collection and call RTutils.Heap, reporting the top 10 types ranked by bytes occupied. Figure 8 shows these reports near the beginning (but after we would have expected the heap size to stabilize) and end of an execution of the theorem-prover on a relatively small test case. --------------------------------------------------------------------- Near beginning: Code Count TotalSize AvgSize Name ---- --------- --------- --------- --------------- 270 1 2457616 2457616 Enode.UndoStack 180 3352 616768 184 Enode.Parent 173 14623 350952 24 RefList.T 230 1 160024 160024 Simplex.RatArr2 294 3804 121728 32 SigTab.EntryList 233 4 81984 20496 143 1857 59424 32 Context.Literal 178 411 59184 144 Enode.Leaf 161 473 49192 104 MatchingRule.Rule 196 19 48216 2537 RTHooks.CharBuffer --------- --------- 33714 4505664 Near end: Code Count TotalSize AvgSize Name ---- --------- --------- --------- --------------- 180 16395 3016680 184 Enode.Parent 270 1 2457616 2457616 Enode.UndoStack 294 14542 465344 32 SigTab.EntryList 173 14611 350664 24 RefList.T 233 4 311360 77840 230 1 160024 160024 Simplex.RatArr2 178 852 122688 144 Enode.Leaf 232 1 64016 64016 Context.TritArray 143 1790 57280 32 Context.Literal 161 503 52312 104 MatchingRule.Rule --------- --------- 59951 7684096 Figure 8: Use of RTutils.Heap --------------------------------------------------------------------- The heap has grown by almost three megabytes. The bulk of that increase can be attributed to the type Enode.Parent, which grew by more than two megabytes. The addition of calls to RTutils.Heap to the source code has helped us identify the first-order term of our problem: we are retaining more Enode.Parent objects than we expected. These reports are useful enough that we left this debugging code in permanently, with printing of the reports controlled by an environment variable. The implementation of RTutils.Heap requires the ability to enumerate all the objects in the heap and classify them by type. Recall that Modula-3 heap objects are prefixed by headers containing a typecode that uniquely designates a type. It is therefore simple to examine all heap objects, constructing a table mapping typecodes to object counts and bytes occupied. Finally, the table is sorted by the appropriate metric and printed. Two additional features of RTutils.Heap merit explanation. First, sometimes a report by typecode is too coarse-grained. For some problems, it is convenient to program in a Lisp-like style within Modula-3. Instances of the type RefList.T are singly-linked lists of generic pointers. This type can be used much like a Lisp cons cell. In programs that use such a style, lists are used in a number of different ways. A report saying that the heap contains many RefList.T's may not identify which of the many uses of the type is responsible for the retained storage. In such a situation, the programmer can use the RTAllocStats interface. This interface exports a procedure EnableTrace, which takes a typecode argument. Once EnableTrace is called for a type, the header of each newly allocated object with that type is annotated with a small integer representing the call site of the allocation. Abstractly, a call site can be thought of as a snapshot of the stack at the time of the allocation. Practically, it is the sequence of the top n program counter values on the stack, where n defaults to 3 but can be set by the user. The single program counter of the actual allocation is insufficient; in the RefList case mentioned above, few RefList.T's are allocated directly by clients of the interface; most are allocated using convenience procedures of the interface like RefList.Cons or RefList.List3. It would be of little use to discover that most lists were allocated within calls to Cons. Figure 9 shows an excerpt from one of these finer-grained reports. Call-site tracking requires more from the runtime system. The allocation routine must first determine whether tracking is enabled for the type being allocated. If so, it must determine the call site, which requires the ability to interpret a thread stack's contents at runtime; note that this code is highly platform-specific. Because we use ``unused'' bits in the standard object header to record call sites information, we are currently limited to 256 sites per type. If this limit is reached, allocations at other sites are credited to a site code reserved for other sites, as shown in figure 9. --------------------------------------------------------------------- Code Count TotalSize AvgSize Name ---- --------- --------- --------- -------------------------- 272 1 2457616 2457616 Enode.UndoStack 173 15416 369984 24 RefList.T 3436 82464 24 Cons + 16_3c in RefList.m3 SubWork + 16_59c in PredSx.m3 SubWork + 16_4dc in PredSx.m3 2333 55992 24 Cons + 16_3c in RefList.m3 CopySx + 16_130 in Clause.m3 CopySx + 16_c4 in Clause.m3 ... 465 11160 24 OTHER SITES ... Figure 9: RTutils.Heap output broken down by call site. --------------------------------------------------------------------- The second additional feature of RTutils solves the opposite problem: when reports by typecode are too fine-grained. For example, ESC processes Modula-3 using M3TK [3], which translates the source code into an abstract syntax tree (AST) that ESC then analyzes. These AST's are made up of many different types; it would be tedious to determine how much space is occupied by such trees from the kind of report described above. However, the types of the tree nodes are organized in a type hierarchy rooted at a generic AST.NODE type. RTutils.Heap produces a second report for Modula-3 object types that reflects the inheritance hierarchy. Since Modula-3 supports only single inheritance, this hierarchy is a simple tree. In figure 10 it becomes clear that AST nodes occupy about 1.2 megabytes, with different subtypes accounting for different fractions of that total. --------------------------------------------------------------------- Code Count TotalSize AvgSize Name ---- --------- --------- --------- -------------------------- ... 311 24866 1223512 49 AST.NODE ... 510 964 49208 51 M3AST_AS.STM ... 516 254 12192 48 M3AST_AS_F.Assign_st ... 527 39 2808 72 M3AST_AS_F.For_st ... Figure 10: RTutils.Heap output broken down by the type hierarchy. --------------------------------------------------------------------- The restriction to single inheritance makes the implementation of this report simple. Typecodes are assigned to types in a pre-order traversal of the type forest, so that the typecodes of the subtypes of a type form a consecutive sequence of integers. The subtype test is therefore a simple range test, making it easy to aggregate the numbers for a type and all its subtypes. In summary, RTutils.Heap answers some of the most basic questions that must be answered when debugging a storage management problem: how big is the heap, what kind of objects are in it, and how many objects of each type are in it? It can answer these questions at finer or coarser grains where necessary. 3.2.2 RTHeapStats.ReportReachable RTutils.Heap identified what was filling up the heap, but did not tell us why the objects filling the heap were escaping collection. The next tool in our kit helps answer that question. An object is retained by garbage collection only if it is reachable from a root of the program, that is, from the stacks, registers, or global variables. The procedure RTHeapStats.ReportReachable tells us how many bytes of storage are reachable from the roots, breaking down the roots in various ways. Each global variable in Modula-3 is associated with some module. One list ranks the modules by bytes reachable from their global variables. A more detailed breakdown ranks the individual global variables by the amount of storage they reach. A second section details storage reachable from thread stacks, again, reporting both at coarse and fine grains. In the coarse grained report, entire stacks are lumped together. In the fine grained report, stack frames are printed along with the storage reachable from individual references contained in those frames. We modified ESC to call ReportReachable periodically. Figure 11 shows excerpts from two reports, one near the beginning of the program, but after we expected the heap size to reach a steady state, and one from near the end. --------------------------------------------------------------------- Near beginning: HEAP: 0x14009a000 .. 0x140c98000 => 11.9 Mbytes Module globals: # objects # bytes unit --------- -------- ----------------- 24471 4151816 Enode.m3 11116 698648 Context.m3 8390 579608 Clause.m3 ... Global variable roots: # objects # bytes ref type location --------- -------- ---------- ----------------- ------------------ 8450 3036328 0x140694008 Enode.UndoStack Enode.m3 + 2224 9065 1098920 0x1402a2558 SigTab.Default Enode.m3 + 1352 12925 703144 0x1400a7990 IntRefTbl.Default Enode.m3 + 4328 ... Near end: HEAP: 0x14009a000 .. 0x141244000 => 17.6 Mbytes Module globals: # objects # bytes unit --------- -------- ----------------- 43159 6534144 Enode.m3 11194 757936 Context.m3 5215 748008 Simplex.m3 ... Global variable roots: # objects # bytes ref type location --------- -------- ---------- ----------------- ------------------ 27149 3456264 0x1402a2558 SigTab.Default Enode.m3 + 1352 8632 3100256 0x140694008 Enode.UndoStack Enode.m3 + 2224 13471 727832 0x1400a7990 IntRefTbl.Default Enode.m3 + 4328 ... Figure 11: Use of RTHeapStats.ReportReachable. --------------------------------------------------------------------- Comparing the reports identifies the Enode module as the major offender. We see also that an object of type SigTab.Default in the Enode module went from reaching just over one megabyte to almost three and one half. The only global variable of this type in the module was named sigTab. Once we had identified sigTab as a possible problem, a moment's thought was sufficient to reveal our mistake. The purpose of sigTab is to ensure that we never create distinct Enode.Parent's with identical children. For small examples it was acceptable to remember all the parents ever created. For large examples, however, such unbounded growth leads to overly large heaps. This table is a cache of parent nodes; we needed to invent a cache-management policy for it. We can delete a Parent from the table when it is no longer in use. Because of its backtracking search structure, the prover ``unwinds'' every action it takes; so we can delete a Parent from the table when we undo the action that added it in the first place. We will briefly describe the implementation of ReportReachable before continuing with our detective story. Essentially, each line item in the report is the result of a mini-garbage-collection. We allocate a new bit vector large enough to serve as mark bits for all the objects in the heap. To determine the set of objects reachable from some subset r of the roots, we clear the mark bits, then (recursively) mark each object reachable from r, using the mark bits to terminate the recursion. When the mark phase is complete, we count the objects and bytes corresponding to the mark bits. This algorithm requires the ability to enumerate the pointer-containing fields of an arbitrary heap object. Note also that presenting the result of thread stack traversals requires the same run-time interpretation of thread stacks needed for call-site tracking in section 3.2.1. Note that this process is potentially quite expensive. If a heap contains a large linked data structure that is reachable from many roots, that structure will be traversed and counted once for each of the different root references. In practice, however, the performance of ReportReachable seems quite acceptable; the reports shown above did not slow down the the execution of the program greatly. 3.2.3 RTHeapDebug In the ESC problem we have been discussing, it was fairly easy to identify a problem once RTHeapStats.ReportReachable led us to an offending global variable. Sometimes, however, this information may not be sufficient; it may be difficult to determine what paths exist from the offending root to objects of the problematic type. The RTHeapDebug interface helps find such paths. Even in a garbage-collected system, it is sometimes possible to identify a point in the program where one expects an object to become garbage. In our ongoing example, we added code to remove an Enode.Parent from sigTab when the prover's search backtracked past the point where the parent was first created and added to the table; we expected the parent to be unreachable after it was deleted from the table. RTHeapDebug allows the programmer to check such expectations. Calling RTHeapDebug.Free(r) asserts that the reference r is unreachable. A call to RTHeapDebug.CheckHeap does the equivalent of a garbage-collection, traversing all reachable objects in search of any that have been asserted unreachable. If such an object is found, CheckHeap prints a path from a root to the putatively unreachable object. In the ESC example, we called RTHeapDebug.Free to assert that Enode.Parent's removed from sigTab by the undo code where unreachable. We called CheckHeap at a later point. We were surprised to find that our assumption was wrong. Figure 12 shows an excerpt of a report from CheckHeap. --------------------------------------------------------------------- Path to 'free' object: Ref in root at address 0x14000a770... Object of type Enode.UndoStack at address 0x14013c008... Free object of type Enode.Parent at address 0x14013eac8... Path to 'free' object: Ref in root at address 0x14000a770... Object of type Enode.UndoStack at address 0x14013c008... Free object of type Enode.Parent at address 0x14013eae0... Path to 'free' object: Ref in root at address 0x14000a770... Object of type Enode.UndoStack at address 0x14013c008... Free object of type Enode.Parent at address 0x14013eaf8... ... Figure 12: Use of RTHeapDebug.CheckHeap. --------------------------------------------------------------------- There was only one variable in the program of type Enode.UndoStack; it is the stack on which undo records are written to implement the backtracking search. Apparently this stack was pointing directly to objects we thought should be unreachable. A little investigation revealed the problem to be a case of references hidden by abstraction, almost exactly the situation described in the example of section 2.3. We considered the undo stack as the portion below the stack pointer, but the garbage collector considered the entire array data structure. Undo records above the stack pointer contained pointers to otherwise unreachable objects. To solve the problem we modified the pop operations of the various undo stacks in the system to set pointers in dead stack entries to NIL. This investigation has purged ESC of its most egregious storage leaks. It might be argued that the technique embodied by RTHeapDebug is regressive, in that it requires the programmer to do the equivalent of explicit storage management. There is some truth to this argument. However, RTHeapDebug.Free is only needed for types identified as storage problems, not for all deallocated objects. Calling RTHeapDebug.Free for an object that is still accessible causes a graceful error message, while actually deallocating a still-accessible object is likely to cause a confusing dangling pointer bug. The implementation of RTHeapDebug uses the WeakRef interface provided by the Modula-3 runtime. A WeakRef.T is a reference to an object that does not prevent the object from being collected. WeakRef provides a routine for obtaining the ``real'' reference corresponding to a WeakRef.T; this routine returns NIL if the referenced object has been collected. RTHeapDebug maintains a set of weak references to freed objects, initially empty. RTHeapDebug.Free(r) adds a weak reference corresponding to r to this set. RTHeapDebug.CheckHeap first throws away any elements in the set whose referents have been collected, then searches for paths to the remaining freed objects. 4. Conclusion We present four classes of storage-management problems that occur even in completely garbage-collected systems: excessive allocation, data structures that grow without bound, references hidden by abstraction boundaries, and references hidden by the system. We have presented tools that aid the diagnosis of such problems. These tools are part of the SRC Modula-3 system, but could easily be adapted for other languages; in fact, some of the tools are not specific to garbage-collected systems. We feel that the trend towards long-lived server programs will speed the adoption of garbage-collected systems; the likelihood of pointer-smashes or storage leaks in explicitly managed systems is far too high for applications that must run reliably for long periods. This paper addresses the (more manageable!) storage management problems that remain once such systems have adopted garbage collection. References [1] Joel F. Bartlett. Compacting garbage collection with ambiguous roots. Technical Report 88/2, Digital Equipment Corporation Western Research Laboratory, February 1988. [2] Hans-Juergen Boehm and Mark Weiser. Garbage collection in an uncooperative environment. Software Practice and Experience, 18(9):807-820, September 1988. [3] Mick Jordan. An extensible programming environment for Modula-3. In Proceedings of the Fourth ACM SIGSOFT Symposium on Software Development Environments. ACM, ACM Press, 1990. [4] Bill Kalsow. SRC Modula-3 home page. URL https://www.research.digital.com/SRC/modula-3/html/home.html. [5] Roy Levin and Paul Mcjones. The Vesta approach to precise config- uration in large software systems. Technical Report SRC-102, Digital Equipment Corporation Systems Research Center, 130 Lytton Ave., Palo Alto, CA 94301, June 1993. [6] Greg Nelson, editor. Systems Programming in Modula-3. Prentice- Hall, Englewood Cliffs, NJ, 1991. [7] Pure Software, Los Altos, California. Purify Version 1.1 Beta A, 1992. User manual.