The following paper was originally published in the Proceedings of the USENIX Conference on Object-Oriented Technologies (COOTS), Monterey, California, June 1995.

For more information about USENIX Association contact:
1. Phone: 510 528-8649
2. FAX: 510 548-5738
3. Email: office@usenix.org
4. WWW URL: https://www.usenix.org

Program Explorer: A Program Visualizer for C++

Danny B. Lange and Yuichi Nakamura
IBM Research, Tokyo Research Laboratory
1623-14, Shimotsuruma, Yamato-shi
Kanagawa-ken 242, JAPAN
e-mail: danny@acm.org

Abstract. Despite the obvious advantages of using object-oriented (O-O) program visualizers in system understanding and debugging, they are still rarely found in the programmer's toolbox. One reason for this is that such visualizers often fail because of their inability to handle problems of a realistic scale. In our research, we have addressed the scalability problem by integrating static and dynamic program information to produce abstract and yet accurate views of complex O-O systems that often provide more useful information than can be obtained by reading the source code. This is the approach we followed in designing Program Explorer, a research prototype for C++ program visualization, which has been used to examine large O-O systems such as Stanford's Interviews library and Taligent's CommonPoint frameworks.

1. Introduction

Whether we want to re-use or debug an object-oriented (O-O) system, we must acquire a thorough understanding of the static and dynamic properties of the system. Although the use of object orientation has changed software development for the better, it has not exactly made programs easier to understand. Inheritance, polymorphism, and encapsulation are all good O-O concepts, but they also tend to make the actual designs in which they are used more difficult to comprehend.
Inheritance makes it difficult to ``read'' the behavior of a particular object, since that object can belong to a chain of, say, five to ten classes; polymorphism often makes it difficult to determine which method is actually executed in a given object; and encapsulation makes it difficult to understand that no object is an island, but rather that each is a part of a cooperative network of objects. In fact, understanding O-O systems is the art of combining knowledge about the concrete (objects and their interaction) and the abstract (classes and their relationships). One cannot fully understand an O-O system by simply considering its concrete or abstract properties in isolation. The former consist of the visible results of the execution of the system, while the latter consist of what we can expect from the system. Understanding evolves from a knowledge of the relations between these two sets of properties. The process of understanding programs is notoriously difficult and cumbersome. Often many different classes are involved and interaction turns out to be non-trivial. The complexity of both the design and some O-O concepts frequently makes it difficult to verify the scenarios we construct with pen and paper. Tools such as CIA++ [Grass92] and GraphLog [Consens92] have been developed to help in understanding O-O programs. The problem, however, is that information given by this type of tool for real-world programs is sometimes very difficult to comprehend. There are very few ways of distinguishing relevant information from less relevant information. The root of the problem is that these tools more or less display the obvious: an unfiltered graphical representation of the source code that is just as difficult to comprehend as the source code itself. Our approach is to combine program execution with the understanding of objects and interactions, and the static program information (source code) with the understanding of classes and their relationships. 
This allows us to determine which classes are relevant and how program behavior changes according to our interaction with the running program. The challenge of this approach is that dynamic analysis of O-O systems involves generating huge amounts of program information that is hard to digest, especially if the information is presented in a purely textual format. The situation is not unique. Other scientific areas characterized by huge amounts of data face the same problem. One solution that has become increasingly popular is called scientific visualization. It has in many cases proven to be a particularly good way of presenting a large amount of information [Nielson90]. Two O-O program visualization tools that utilize dynamic information are Object Visualizer [Pauw93,Pauw94] and HotWire [Laffra94]. Both rely on visual effects to draw attention to program anomalies, rather than giving exact information. Although this approach appears efficient for detecting problems and, to some extent, for localizing them, it hardly provides the exactness that programmers need in order to understand a problem and correct it. In our research, we have developed a mechanism capable of amplifying patterns of object interaction in program visualizations. Our approach is based on the observation that static information can leverage dynamic information and vice versa. The technique is to let the dynamic information impose (1) a sense of relevance on the static entities, such as which classes or member functions are actually important during a certain phase of execution, and (2) a sense of sequence, such as the order in which member functions are called. In contrast, static information allows us to focus on (1) objects of certain classes, and (2) properties related to certain classes. 
Hence, the two main issues that have to be addressed in order to visualize O-O programs are the availability of program information (capturing static as well as dynamic information), and the scalability of information processing and presentation (that is, the ability to create useful visualizations of real-world systems). We have addressed both these issues in the development of a research prototype for program visualization named Program Explorer.

Program Explorer is a system for understanding C++ programs through visualization. Its purpose is to provide class- and object-centered views of the structure and behavior of large C++ systems, with information accurate enough to enable programmers to re-use and maintain undocumented parts of such systems. The tool should be graphically oriented and should provide interactive hypertext-like navigation of program entities in running programs.

In the following section, Program Explorer's way of coupling static and dynamic program information is described and some visualizations are displayed. Section 3 presents the Program Database, the source of static program information. Sections 4 and 5 describe how the dynamic program information is generated and retrieved. Section 6 discusses related work, and Section 7 presents our conclusions.

--- Figure 1: Class-to-Object Coupling: Vertical Selection.

2. Information Coupling and Visualization

From the Program Database, Program Explorer can retrieve static program information for a number of interesting visualizations such as class inheritance hierarchy, function calls, and variable access. From the execution trace of a running system it can generate a number of visualizations that can give insights into its dynamic properties, such as object creational hierarchy, object calls, and object usage.
It is thus possible to investigate the exact sequences of interaction among several objects needed to accomplish some task, and to detect program anomalies such as objects being created but never destroyed, and objects being invoked after they have been destroyed.

However, in our experience the main problem with pure static and pure dynamic models is their limited applicability to systems of a realistic size. The reason for this is that, in a sense, pure static views as well as pure execution views just show the obvious. That is, they show what is already present in the source code and in the computer memory during program execution. What helps in program understanding is the coupling of the abstract and the concrete, that is, of static and dynamic information. In Program Explorer we have found two coupling techniques that are particularly useful, namely, Classes-to-Objects and Objects-to-Classes.

--- Figure 2: Class-to-Object Coupling: Horizontal Selection.

The Classes-to-Objects coupling can be used to filter dynamic information by means of static information. In Program Explorer, we have defined two particular categories of selection based on the class inheritance graph: vertical and horizontal selection. By vertical selection we mean the process of selecting objects of a given class. By horizontal selection we mean the selection of certain object properties related to a specific class. Often, we use vertical selection to select a group of objects, and then use horizontal selection to study particular aspects of their interaction.

We have created an Interviews [Linton92] sample program to examine the use of flyweight objects in Interviews (see Appendix A and Figure 7). We know that ivGlyph is the root class for all graphical objects; thus, by retrieving objects derived from this class we can examine all graphical objects produced by this sample program. Figure 1 demonstrates such use of vertical selection.
In the right pane we see the Object Graph of graphical objects and their creators. The nodes in the graph are objects identified by their class name and a unique number (World<1> represents the un-instrumented creators). The arcs represent creational relationships. In the left pane we find the Invocation Chart, which displays object longevity (the lengths of the bars) and the order of creation (from top to bottom). --- Figure 3: Moving Focus to ivTransformSetter. Following the vertical selection, in Figure 2 we show an example of horizontal selection. The class ivGlyph defines a virtual function draw that is implemented in each of the derived classes. By selecting ivGlyph::draw we focus on a single behavioral aspect of these objects. The hierarchical drawing process from window to canvas becomes very clear, particularly in the invocation chart. It also shows how the label, as a flyweight object, is reused and drawn in three different contexts: plain, with a shadow, and transformed (rotated). By Objects-to-Classes coupling we mean the transfer of dynamic information to the static domain. The purpose of this coupling is to filter huge amounts of static information in such a way that only the relationships actually used are in focus. Such visualizations can be used to express class-proximity (quantification of dynamic information to determine how dependent classes are on each other), or they can be used in class-based diagrams to describe object communication by numbered call relationships among classes, indicating the order in which invocations are supposed to take place. Surprisingly informative visualizations can be obtained by applying static and dynamic information coupling to large systems. --- Figure 4: View Integration: Four Views of ivLabel. Orthogonal to the static and dynamic representations of O-O program executions are their visual representations. 
The purpose of a visual representation is to communicate the content of a given view of an O-O program and its execution. Different visual representations have different characteristics that make them more or less suited for particular views. In our research we have implemented and studied a number of interesting visual representations including graphs, bar charts, and matrices. The use of color has also been explored. Our color convention (which can be re-defined by the user) is red for objects, green for classes, blue for free functions, and brown for the un-instrumented part of the system. Highlighting is used to display selected entities.

Two interaction techniques are used in Program Explorer's GUI to deal with the issue of scalability. Navigation is the technique used within a given view to explore it in a step-by-step fashion comparable to that of hypertext linking. In the previous example, we can move the focus to the ivShadow and ivTransformSetter objects to examine how a label is actually given a shadow and how it is rotated. In Figure 3 we have moved the focus to one of the above objects, and we can see how ivTransformSetter actually instructs the canvas to rotate the label object without the latter's knowledge. Navigation allows the user to focus on certain parts of a graph while ignoring others.

--- Figure 5: The System Architecture of Program Explorer.

A focus can also be exported to another view. As is shown in Figure 4, Program Explorer's GUI consists of four panes, of which three are graphical and one is textual. The integration of these four panes through the focusing mechanism enables the user to view static properties in one pane and dynamic properties in another. We regard this as a GUI-based coupling of static and dynamic information.

3. Static Program Information

Retrieving static program information for C++ programs is strictly speaking not a part of Program Explorer.
It is retrieved by the Program Database from so-called pdb-files generated by IBM's xlC compiler.

Table 1: Schema for Static Information.

Facts on entities:
    pd_directory(ID, Name, PathName, ParentDirID)
    pd_file(ID, Name, PathName, Time, Language, DirID)
    pd_class(ID, Name, ClassType, IsSOM, SOMName)
    pd_function(ID, Name, PrototypeString, FuncStorageClass, Const, Inline,
                Overload, Operator, VirtualSpecifier, FuncMiscAttriute,
                Volatile, IsSOM, SOMName)
    pd_variable(ID, Name, DeclarationString, VarStorageClass, Const, Volatile)
    pd_enumeration_tag(ID, Name)
    pd_enumerator(ID, Name, EnumID)
    pd_macro(ID, Name, NumberOfArguments)
    pd_typedef(ID, Name, DeclarationString)
    pd_label(ID, Name)
    pd_template(ID, Name, TemplateType, LongName)

Facts on relationships:
    pd_defined(ID, ScopeID, FileID, Line, Column)
    pd_declared(ID, ScopeID, FileID, Line, Column)
    pd_used(ID, ScopeID, FileID, Line, Column)
    pd_used_implicit(ID, ScopeID, FileID, Line, Column)
    pd_used_lvalue(ID, ScopeID, FileID, Line, Column)
    pd_member(ClassID, MemberID, AccessSpecifier, Offset)
    pd_friend(ClassID, FriendID, Line, Column, FileID)
    pd_inherit(DerivedClassID, BaseClassID, AccessSpecifier, Virtual, Order, FileID)
    pd_include(FileID, IncludedFileID, Line)
    pd_instantiated(InstantiatedID, TemplateID)
    pd_source2pdb(SourceFileID, PdbFileID, CompilerOption, Language)
    pd_call(CallerID, CalleeID)

Facts on types of names:
    pd_typeof_variable(ID, Type)
    pd_typeof_function(ID, ReturnType, NumberOfArguments, ListOfArgumentTypes)
    pd_typeof_typedef(ID, Type)

An overview of the system architecture of Program Explorer can be seen in Figure 5. The system includes a program database for C++, an instrumentation utility for augmenting C++ programs with code that produces trace information, a Trace Recorder linked to the instrumented programs that captures the trace, and finally, Program Explorer, which controls program execution and presents static and dynamic information through its GUI.
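The pd_inherit facts of Table 1 lend themselves to transitive traversal, for instance to collect every class derived from a root class such as ivGlyph, which is what vertical selection needs. The following is a minimal C++ sketch of that idea and is not taken from Program Explorer: the Inherit struct and the derivedFrom function are hypothetical names, and only the DerivedClassID and BaseClassID columns of pd_inherit are modeled.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical in-memory stand-in for pd_inherit facts. Only the
// DerivedClassID and BaseClassID columns are kept; the rest are omitted.
struct Inherit {
    int derived;
    int base;
};

// Returns the IDs of all classes that derive, directly or indirectly,
// from the class with ID `root`. The root itself is not included.
std::set<int> derivedFrom(int root, const std::vector<Inherit>& facts) {
    // Index the facts by base class so each lookup is cheap.
    std::multimap<int, int> children;
    for (const Inherit& f : facts) {
        children.insert({f.base, f.derived});
    }

    std::set<int> result;
    std::vector<int> work{root};  // classes whose subclasses are still unvisited
    while (!work.empty()) {
        int current = work.back();
        work.pop_back();
        auto range = children.equal_range(current);
        for (auto it = range.first; it != range.second; ++it) {
            // insert() reports whether the class is new, which also guards
            // against revisiting a class reached through multiple inheritance.
            if (result.insert(it->second).second) {
                work.push_back(it->second);
            }
        }
    }
    return result;
}
```

Given the resulting set of class IDs, a client could keep only those trace objects whose ClassID is in the set. An explicit worklist is used here instead of recursion so that deep hierarchies do not depend on call-stack depth.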
The Program Database is a stand-alone application that implements the schema of static program information from Table 1 and provides a full Prolog interface to clients. A simple query

    pd_class(Cid, Cname, class,_,_)?

will return a set of pairs of (ClassID, ClassName). The query can be extended to

    pd_inherit(DerID, BaseID,_,_,_,_),
    pd_class(DerID, DerName, class,_,_),
    pd_class(BaseID, BaseName, class,_,_)?

which returns the set of binary inheritance relationships between classes. The built-in Prolog interpreter can also compute recursive queries, and the uniform representation of facts and rules in Prolog allows clients to append rules to the database and thus create their own logic programs as a part of the database.

4. Program Instrumentation

Collecting accurate information on a running C++ program is not easy. Without a meta-class concept, there are basically three ways of collecting such information: (1) extend the compiler so that it adds trace-generating code to the normal code, (2) use debugging techniques, or (3) augment the source code with trace-generating code.

Many compilers offer options for generating profiling information that can be used to detect ``hot spots'' in the code. Traces of this type do not provide enough information for our kind of visualization. Ideally, compilers could be modified to insert code that produces detailed trace information, but we considered that task to be beyond the scope of our work.

Another way of retrieving dynamic information is to use debugging techniques. Normally, this is done by setting breakpoints at the functions in a program. However, when dynamic information is collected in this way, a great deal of time is spent by the system in context switching between the process of the target program and the process of Program Explorer. Processes are heavyweight in AIX, and a context switch from one process to another is a costly operation.

The third solution is to augment the source code with trace-generating code.
This technique is often termed program instrumentation. C++ preprocessors have been used to instrument programs [Pauw93], but a complete instrumentation would require a semantic analysis comparable to the one carried out in the compiler. We found instrumentation to be an acceptable solution, but instead of using a C++ preprocessor we decided to rely on the Program Database for accurate instrumentation information. The contents of the Program Database are compiler-generated and thus very exact with respect to the semantics of program entities and their physical location.

Instrumentation. Program Explorer provides selective instrumentation on a class-wise basis. The GUI of Program Explorer allows the user to specify which classes should be instrumented. A directory-file-class hierarchy allows flexible selection of whole directories, files, and classes. This technique is suitable for avoiding trace information from highly active classes and from classes that are trivial or already well understood. We intend to extend this technique to encompass functions as well as variables.

Our aim has been to instrument C++ programs to produce complete and accurate trace information. For that purpose it is necessary to capture events related to object longevity, function invocation, and variable access. Program events are captured by the Trace Recorder through the internal protocol (see Figure 6 and Table 2). The Trace Recorder processes events to produce a trace, which it also stores. Program execution is controlled and trace information is queried by Program Explorer through the Trace Recorder's external interface.

--- Figure 6: The Executable: Program and Trace Recorder.

Table 2: Internal Protocol for the Trace Recorder.
    Command       Arguments
  Longevity
    Allocate      ObjectID ClassID MemoryAddr
    Deallocate    ObjectID
  Invocation
    Construct     ObjectID ClassID FunctionID
    Destruct      ObjectID ClassID FunctionID
    Enter         ObjectID ClassID FunctionID
    Leave         ObjectID ClassID FunctionID
  Usage           ObjectID VariableID RetrFunc ObjectAddr

It should be emphasized that the instrumentation code shown below is in no way specific to IBM's xlC compiler. The code fragments are generally applicable for tracing C++ programs, and only the way in which we insert those fragments is specific to the xlC compiler and the Program Database.

Object Longevity. Two events related to object longevity are captured: creation and destruction of an object. For this purpose every instrumented class is equipped with an ``identity object'', _PEoid:

    class A {
        PE_Oid _PEoid;
        ...
    }

For this instrumentation we use C++'s automatic construction and destruction of class members. The constructor of PE_Oid assigns a unique number to the object and calls Allocate in the Trace Recorder, whereas PE_Oid's destructor calls Deallocate. The constructor and destructor functions of A are not suitable for capturing object creation and destruction, since such events actually take place before and after these functions are called, respectively.

Where should we put _PEoid in the case of inheritance? We put it into each instrumented class, even if these classes appear in the same inheritance path. To avoid allowing one object to have multiple identities, subclass _PEoids ``inherit'' their unique identity (an integer value) from the base class. In the case of multiple inheritance this mechanism works the opposite way.

    class B : public A {
        PE_Oid _PEoid(A::_PEoid.oid());
        ...
    }

Why not use the value of this as a unique id of an object? This would not work, for several reasons.
First, if multiple inheritance, possibly with virtual base classes, is involved, different parts of an object (depending on which class in the inheritance path that part is representing) will return different values of this. Second, when an object is deallocated and a new object is created in the same space, it would be difficult to maintain a unique trace, as object ids would no longer be unique over time.

Function Invocation. Function invocations are captured with a mechanism identical to the one described for object longevity. _PEtmp is a local class variable that is automatically constructed when the function is invoked, and destructed upon its return:

    A::F() {
        PE_Func _PEtmp(_PEoid.oid(), ClassID, FunctionID);
        ...
    }

The constructor of PE_Func calls Enter in the Trace Recorder, and the destructor calls Leave. One of the benefits of this approach is that only the callee is instrumented, and since one function may have many callers, it is more efficient than instrumenting call locations.

The C++ compiler silently generates a number of functions if they are not explicitly defined by the programmer. Such functions are the constructor, destructor, copy-constructor, and assignment operator. For more details on this subject, consult the C++ ARM [Ellis90]. Regardless of the implicitness of these functions, they often play an important role in the execution of an O-O program and thus cannot be ignored. The copy-constructor creates new objects, and along with the assignment operator it copies the state of one object to another. No source code exists for compiler-generated functions, so we need to make these functions explicit.

Below is an example of a constructor and destructor. Notice that PE_Constr and PE_Destr are used instead of PE_Func to distinguish these three types of function invocation. The constructor of PE_Constr calls Construct in the Trace Recorder and PE_Destr calls Destruct.
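The Enter/Leave pairing described above relies on a local object whose constructor and destructor bracket the function body. The following is a minimal sketch of such a guard, under stated assumptions: the Event struct and the traceLog function are hypothetical stand-ins for the Trace Recorder, the numeric IDs are made up, and the local variable is named peTmp rather than _PEtmp because identifiers beginning with an underscore followed by an uppercase letter are reserved in standard C++.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record of one traced event (stand-in for the Trace Recorder).
struct Event {
    std::string command;  // "Enter" or "Leave"
    int objectId;
    int classId;
    int functionId;
};

// Hypothetical global trace buffer.
std::vector<Event>& traceLog() {
    static std::vector<Event> log;
    return log;
}

// Scope guard: logs Enter on construction and Leave on destruction,
// so every exit path of the instrumented function is covered.
class PE_Func {
public:
    PE_Func(int objectId, int classId, int functionId)
        : objectId_(objectId), classId_(classId), functionId_(functionId) {
        traceLog().push_back({"Enter", objectId_, classId_, functionId_});
    }
    ~PE_Func() {
        traceLog().push_back({"Leave", objectId_, classId_, functionId_});
    }
    PE_Func(const PE_Func&) = delete;
    PE_Func& operator=(const PE_Func&) = delete;

private:
    int objectId_;
    int classId_;
    int functionId_;
};

// Example instrumented class; object, class, and function IDs are made up.
class A {
public:
    int f(int x) {
        PE_Func peTmp(1, 10, 100);
        if (x < 0) return -1;  // the early return still produces a Leave event
        return g(x) + 1;
    }
    int g(int x) {
        PE_Func peTmp(1, 10, 101);
        return x * 2;
    }
};
```

Because the guard lives in the callee, a call such as A().f(3) records Enter and Leave for f with the events for g nested between them, without touching any call site.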
The constructor instrumentation is as follows:

    A::A() {
        PE_Constr _PEtmp(_PEoid.oid(), ClassID, FunctionID);
        ...
    }

and the destructor instrumentation is as follows:

    A::~A() {
        PE_Destr _PEtmp(_PEoid.oid(), ClassID, FunctionID);
        ...
    }

The copy-constructor and the assignment operator are more complicated to create than the above constructor and destructor. In both cases it is necessary to create separate initialization lists for scalar member variables and assignment routines for array members. Below is an example of an explicit copy-constructor. Notice that _PEoid() in the initializer list ensures that the copied object receives a new identity different from the originator (rhs).

    A::A(A& rhs) : _PEoid(), r(rhs.r), s(rhs.s),... {
        PE_Constr _PEtmp(_PEoid.oid(), ClassID, FunctionID);
        // array member assignment
    }

Free functions (non-member functions) are also instrumented. In their case the class identity and object identity are set to zero.

Table 3: Schema for Dynamic Information.

Object longevity:
    create(SrcObjID, SrcFunID, TgtObjID, ClassID, Time)
    destroy(SrcObjID, SrcFunID, TgtObjID, Time)

Interactions:
    invoke(SrcObjID, SrcFunID, TgtObjID, TgtFunID, Time)
    access(SrcObjID, SrcFunID, TgtObjID, VariableID, Time)

State:
    value(VariableID, Value, Time)

Variable Access. Variable access is more difficult to capture than function invocations. The approach we have taken is to ``functionalize'' variable access. That is, each definition of a member variable is accompanied by an access function to be used instead of direct variable access. The original member definition

    B* b;

is supplemented by two functions. The first serves to capture variable access:

    B*& b_PE() {
        PE_usage(_PEoid.oid(), VariableID, A::b_PEvalue);
        return b;
    }

while the second is used to retrieve the value of the variable:

    static void* b_PEvalue(void* o) {
        return (void*)((A*)o)->b;
    }

The access function (PE_usage(...)) notifies the Trace Recorder about variable uses.
It also forwards a pointer to a function able to return the value of the variable. The reason the Trace Recorder does not receive the value directly is that variable assignment takes place only after the access function has returned. Member variable access is modified by adding the function-suffix _PE(). This suffix works for both reading a variable:

    b->foo();      becomes      b_PE()->foo();

and writing to a variable:

    b = new B;     becomes      b_PE() = new B;

Table 4: External Protocol for the Trace Recorder.

                       Command             Input           Output
  Execution Control    exit
                       run                                 Focus Record
                       step                                Focus Record
                       callStep                            Focus Record
                       returnStep                          Focus Record
                       constructionStep                    Focus Record
                       usageStep                           Focus Record
                       setBreakPoint       Focus Record
  Trace Recording      invocRecording
                       usageRecording
  Query Processing     about                               Result
                       getInvocations      Focus Record    Result
                       getConstructions    Focus Record    Result
                       getUsages           Focus Record    Result
                       getPointers         Focus Record    Result

5. Trace Recording and Execution Control

When an instrumented program has been compiled and linked with the Trace Recorder, it is ready to be executed by Program Explorer. Below we describe trace recording and how it is controlled and used by Program Explorer.

Trace Recording. The atomic events of a running program are captured by the Trace Recorder and transformed into a sequence of binary relations. The event in which an object is created is transformed into a relation that specifies the identity of the object being created as well as that of its creator. Function invocations are captured as a sequence of Enter- and Leave-events that can easily be converted into the corresponding series of binary call relationships. Binary relations are more convenient than raw events with regard to storage management and query processing. The schema for representing these relationships is given in Table 3.

Since the Trace Recorder stores and manages the trace, Program Explorer has to pose queries to the Trace Recorder in order to produce visualizations of dynamic information.
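The conversion from Enter- and Leave-events into binary call relationships can be sketched with an explicit call stack: the caller of each entered function is whatever is on top of the stack at that moment. This is an illustrative sketch only, not the Trace Recorder's implementation. RawEvent, Invoke, and toInvocations are hypothetical names, the time stamp is assumed to be the event's position in the stream, and an empty stack is assumed to mean an un-instrumented caller, represented here by object and function ID zero.

```cpp
#include <cassert>
#include <vector>

// Hypothetical raw event as delivered by the instrumentation.
struct RawEvent {
    bool isEnter;    // true for Enter, false for Leave
    int objectId;
    int functionId;
};

// Binary call relation, modeled on invoke(SrcObjID, SrcFunID, TgtObjID,
// TgtFunID, Time) from Table 3.
struct Invoke {
    int srcObj;
    int srcFun;
    int tgtObj;
    int tgtFun;
    int time;
};

// Turns a well-nested Enter/Leave stream into caller-to-callee relations.
std::vector<Invoke> toInvocations(const std::vector<RawEvent>& events) {
    struct Frame {
        int obj;
        int fun;
    };
    std::vector<Frame> stack;  // currently active instrumented functions
    std::vector<Invoke> result;
    int time = 0;              // assumption: time is the event index

    for (const RawEvent& e : events) {
        if (e.isEnter) {
            // Assumption: with no active frame the caller is un-instrumented
            // code, written as object 0 and function 0.
            Frame caller = stack.empty() ? Frame{0, 0} : stack.back();
            result.push_back({caller.obj, caller.fun,
                              e.objectId, e.functionId, time});
            stack.push_back({e.objectId, e.functionId});
        } else if (!stack.empty()) {
            stack.pop_back();  // Leave closes the most recent Enter
        }
        ++time;
    }
    return result;
}
```

Storing one relation per call, rather than two raw events, halves the number of records and makes questions such as "who called this object" a simple filter over the result.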
The query interface to the Trace Recorder is a part of its external protocol given in Table 4. Unlike the Program Database, the Trace Recorder only provides a fixed number of query functions. This restriction stems from the requirement for low response times and space-efficient storage of large traces. Still, the query mechanism is very flexible, since each query can take a Focus Record as argument. A Focus Record specifies any meaningful selection of a static or dynamic program entity or relationship, such as class, object, function, or invocation. Queries take the form of

    someQuery(ClassID, SrcObjID, TgtObjID, FuncID/VarID)

and return lists of

    Result(SrcClassID, SrcObjID, TgtClassID, TgtObjID, FuncID/VarID)

Notice that, since no text information is exchanged between Program Explorer and the Trace Recorder, the Program Database acts as a name server. Using unique class, function, and variable identifiers gives a very compact trace, lowers the communication overhead, and allows the system to distinguish overloaded names.

Execution Control. The instrumented program is executed under the control of Program Explorer. That is, Program Explorer uses the control interface of the Trace Recorder's external protocol to run the instrumented program or to execute it in a single-step mode. This mechanism is well known from debuggers and very suitable for localization (finding a spot of particular interest). Whenever execution in the instrumented program is halted (that is, when the program exits, or reaches a breakpoint, or when a signal from the operating system is received), a Focus Record is returned to Program Explorer, allowing it to retrieve information about the events that led to that halt. Unlike in debuggers, this information includes full details of all the recorded events up to the halt, and not only the contents of the call stack.

The Trace Recorder also allows selective trace recording.
Both invocation and variable usage recording can be switched on and off independently, thus limiting the generation of trace information. This technique and the selective instrumentation mentioned in Section 4 are the two means of reducing trace generation. Selective instrumentation is limited to compile-time. In the future we would like to extend this mechanism to run-time, so that selective trace recording is supported by two orthogonal concepts: program-entity-based and breakpoint-based selection.

Table 5: Trace Recording Statistics.

  Application Program    himom    flyweight    preview     idemo       doc
  LOC                       29           31        197       635    15,210
  Classes                   42           53         84       112        85
  Functions                263          365        514       731       564
  Objects                  152          271      9,893     9,437    17,404
  Invocations              603        1,146     32,327    33,291    64,553

Statistics. Table 5 shows statistics from the trace recording of an instrumented version of the Interviews library. For some common Interviews sample and application programs we show the number of different class and function definitions involved in a particular execution, and the number of actual object creations and function invocations. In these examples, no attempts were made to limit trace generation. An important observation that can be drawn from this table is that, while the amount of dynamic information grows rapidly with the size and complexity of the application program, the size of the static information space remains almost constant. This observation supports our approach of using static information to leverage dynamic information.

6. Related Work

Two query-based program visualization tools - CIA++ [Grass92], developed at AT&T Bell Laboratories, and GraphLog [Consens93], developed at the University of Toronto - focus on the static properties of O-O systems. CIA++ builds a relational database of information extracted from C++ programs. The database serves as a foundation for static analysis tools for displaying various views of the program structure. GraphLog is a visual tool for databases.
Queries are posed by drawing graph patterns with a graphical editor. GraphLog has been used for visualizing and querying software structures [Consens92]. Since both tools are limited to static program information, their visualizations are of limited interest unless very interesting queries are posed. In our experience, however, writing such queries is often difficult and distracts the user's attention from the original goal of understanding a program.

Object Visualizer [Pauw93,Pauw94] and HotWire [Laffra94], both developed at the IBM T. J. Watson Research Center, are dynamic O-O program analyzers that primarily rely on visual effects to draw attention to program anomalies rather than giving exact information. Both tools are based on the same program instrumentation mechanism for gathering execution information. This mechanism is seemingly less accurate than Program Explorer's, and does not generate information on implicit functions, variable usages, and variable values. The execution model of Object Visualizer [Pauw94] is accumulative, and is intended for O-O profiling uses such as finding ``hot spots'' in classes and objects. Another capability of this tool is visualizing ``memory leaks,'' that is, objects that are not deleted after use. HotWire is a visual debugger for C++ that allows the user to write custom visualizations in a scripting language. While this scripting mechanism seems to be very useful for algorithm and object animation, we fear that it distracts the user's attention from program debugging (which often is performed under strong time pressure). HotWire resembles Program Explorer more closely than does Object Visualizer. HotWire and Program Explorer support microscopic views into the state and behavior of individual objects, whereas Object Visualizer focuses on the overall picture characteristic of program profiling tools.

To our knowledge, the systems described above have been mainly applied to programs written in C++.
However, visualizers have also been constructed for other O-O languages such as Smalltalk (e.g., message diagramming [Cunningham86], the Trick system [Boecker90], and Portia [Gold91]), and LISP dialects (e.g., GraphTrace [Kleyn88]). A common feature of these systems is that they benefit from the openness of interpreted O-O languages. Objects actually exist at runtime in these systems, whereas runtime structures in C++ are flat and not very O-O.

7. Conclusion

Scalability has been the major issue in the design and implementation of Program Explorer. The issue is complex, since it concerns human as well as computer resources. In our research, however, we have found a common denominator for addressing this issue: static-dynamic information coupling.

Let us take the computer resources first. With a practically complete instrumentation utility, Program Explorer has been successfully used to instrument and generate trace information for large O-O systems such as Stanford's Interviews library and some of Taligent's CommonPoint frameworks [Myers95]. To reduce the amount of generated dynamic information, we rely on the coupling of static and dynamic information to perform selective class-wise instrumentation combined with the use of breakpoints to switch execution tracing on and off.

Human resources, on the other hand, are primarily related to the reduction of the cognitive load. For this, we also rely on the coupling of static and dynamic information to produce visualizations that combine dynamic properties with static properties and vice versa. Such filtered views allow the programmer to focus on certain aspects of system behavior while ignoring others. Finally, the GUI of Program Explorer relies on static-dynamic information coupling to produce hypertext-style integration between the different visualizations.

At the time of writing, three different prototypes of Program Explorer have been developed.
In addition to the one described in this paper, we have developed a version based on debugging technology, which makes both program instrumentation and the program database unnecessary. The advantage of this system is a shorter edit-compile-explore turnaround time for the developers. However, the price paid is a greatly increased execution time. The third version, for IBM's System Object Model (SOM) [IBM93], replaces program instrumentation with an extension to the SOM metaclass that captures method invocations [Forman94] and uses the SOM repository for static information.

Currently, we are investigating the concept of O-O Design Patterns [Gamma94] with the goal of automating the processes of searching for and visualizing recurring designs in O-O systems. Viewing an O-O system from the perspective of design patterns often makes the detailed design more comprehensible. Our initial experience is that the static-dynamic coupling mechanisms described in this paper are very useful for pattern analysis [Lange95b]. The reason for this is that design patterns very often rely on ``abstract behavior'' defined in abstract classes but realized in concrete classes. We have found the vertical and horizontal selection technique described in Section 2 to be particularly useful for showing design patterns. If we can formally express the semantics of design patterns, we will have the basic means for realizing automated search and visualization.

Even though Program Explorer is not intended for debugging, it demonstrates a clear potential for visual debugging. Its class- and object-centered visual representations of static and dynamic program information are an ideal communication medium for programmers.
Moreover, its ability to keep a history of function invocations, as well as accesses to variables and the values of those variables, suggests the possibility of a more efficient debugging process, in which programmers are able to investigate the events that lead to a run-time error instead of just being told where the error occurred.

Acknowledgements

We wish to thank our many colleagues at the IBM Tokyo Research Laboratory for their contributions to the Program Explorer project, and in particular Dr. T. Kamimura for his unflagging support. We are also grateful to R. Thornton and R. Pfeiffer of Taligent for their kind assistance in testing Program Explorer on Taligent's CommonPoint frameworks, and to M. McDonald of IBM Japan for checking the wording of this paper.

References

H. Böcker and J. Herczeg. What Tracers Are Made of. In OOPSLA '90, Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 89--99, 1990.

M. Consens, A. Mendelzon, and A. Ryman. Visualizing and Querying Software Structures. In Proceedings of the 14th International Conference on Software Engineering, pages 138--156, 1992.

M. Consens and A. Mendelzon. Hy+: A Hygraph-based Query and Visualization System. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, SIGMOD Record, 22(2), pages 511--516, 1993.

W. Cunningham and K. Beck. A Diagram for Object-Oriented Programs. In OOPSLA '86, Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 361--367, 1986.

M. A. Ellis and B. Stroustrup. The Annotated C++ Reference Manual. Addison-Wesley, 1990.

I. R. Forman, S. Danforth, and H. Madduri. Composition of Before/After Metaclasses in SOM. In OOPSLA '94, Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 427--439, 1994.

E. Gamma, R. Helm, R. Johnson, and J. Vlissides.
Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1994.

E. Gold and M. B. Rosson. Portia: An Instance-Centered Environment for Smalltalk. In OOPSLA '91, Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 62--74, 1991.

J. E. Grass. Object-Oriented Design Archaeology with CIA++. Computing Systems, 5(1), pages 5--67, 1992.

IBM. SOMObjects Developer ToolKit, Users Guide. IBM Corp., 1993.

M. F. Kleyn and P. C. Gingrich. GraphTrace - Understanding Object-Oriented Systems Using Concurrently Animated Views. In OOPSLA '88, Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 191--205, 1988.

C. Laffra and A. Malhotra. HotWire -- A Visual Debugger for C++. In Proceedings of USENIX C++ Technical Conference. USENIX Association, pages 109--122, 1994.

D. B. Lange and Y. Nakamura. Interactive Visualization of Design Patterns Can Help in Framework Understanding. To appear in OOPSLA '95, Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages, and Applications, 1995.

M. A. Linton, P. R. Calder, J. A. Interrante, S. Tang, and J. M. Vlissides. InterViews Reference Manual Version 3.1. The Board of Trustees of the Leland Stanford Junior University, 1992.

W. Myers. Taligent's CommonPoint: The Promise of Objects. IEEE Computer, 28(3), pages 78--83, 1995.

G. M. Nielson, B. D. Shriver, and J. Rosenblum. Visualization in Scientific Computing. IEEE Computer Society Press, 1990.

W. De Pauw, R. Helm, D. Kimelman, and J. Vlissides. Visualizing the Behavior of Object-Oriented Systems. In OOPSLA '93, Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 326--337, 1993.

W. De Pauw, D. Kimelman, and J. Vlissides. Modeling Object-Oriented Program Execution. In Proceedings of the 8th European Conference, ECOOP '94. Lecture Notes in Computer Science, Vol.
821, pages 163--182, 1994.

A. The Flyweight Sample Program

The program below starts with the creation of a session, a widget kit, and a layout kit. The widget kit is used to create a text label. A transformer is created and set to a 90-degree rotation. In the session window, the label appears three times: normally, transformed (rotated by 90 degrees), and with a shadow background (see Figure 7).

    int main() {
        // Create the session and obtain the widget and layout kits.
        Session* session = new Session();
        WidgetKit& kit = *WidgetKit::instance();
        LayoutKit& layout = *LayoutKit::instance();

        // The widget kit creates the text label.
        Glyph* label = kit.label("HELLO");

        // Transformer set to a 90-degree rotation.
        Transformer t;
        t.rotate(90.0);

        // The same label glyph appears three times in the window:
        // normally, rotated, and with a shadow background.
        session->run_window(
            new ApplicationWindow(
                new Background(
                    layout.hbox(
                        label,
                        new TransformSetter(label, t),
                        new Shadow(label, 0, 0, new Color(0.7, 0.7, 0.7, 1.0))
                    ),
                    kit.background()
                )
            )
        );
    }

Figure 7: The Flyweight GUI.
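
A program such as the one above is traced by recording function invocations and variable usages, each of which can be switched on and off independently. The following is only a minimal sketch of that idea, not Program Explorer's actual recorder: the TraceRecorder class, the TraceEvent layout, the simulate_run function, and the entity names such as "Label::draw" are all hypothetical and chosen purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical event kinds; the trace format used by Program Explorer is not shown here.
enum class EventKind { Invocation, VariableUse };

// One recorded event. All field names are assumptions made for this sketch.
struct TraceEvent {
    EventKind kind;
    std::string entity;  // illustrative name of a function or variable
    std::string value;   // recorded value for variable uses, empty for invocations
};

// Illustrative recorder with two independent switches, one per event kind.
class TraceRecorder {
public:
    void enable_invocations(bool on) { invocations_on_ = on; }
    void enable_variables(bool on) { variables_on_ = on; }

    // Appends an invocation event only while invocation recording is on.
    void record_invocation(const std::string& function) {
        if (invocations_on_) {
            events_.push_back(TraceEvent{EventKind::Invocation, function, std::string()});
        }
    }

    // Appends a variable-use event only while variable recording is on.
    void record_variable(const std::string& variable, const std::string& value) {
        if (variables_on_) {
            events_.push_back(TraceEvent{EventKind::VariableUse, variable, value});
        }
    }

    // Counts the recorded events of one kind.
    std::size_t count(EventKind kind) const {
        std::size_t n = 0;
        for (const TraceEvent& e : events_) {
            if (e.kind == kind) ++n;
        }
        return n;
    }

    // The full history, in recording order.
    const std::vector<TraceEvent>& history() const { return events_; }

private:
    bool invocations_on_ = true;
    bool variables_on_ = true;
    std::vector<TraceEvent> events_;
};

// Simulates a short run in which the two switches are turned off at different times.
TraceRecorder simulate_run() {
    TraceRecorder rec;
    rec.record_invocation("Label::draw");
    rec.record_variable("Label::text_", "HELLO");
    rec.enable_variables(false);                    // stop recording variable usage only
    rec.record_invocation("Shadow::draw");          // still recorded
    rec.record_variable("Shadow::x_offset_", "0");  // dropped
    rec.enable_invocations(false);                  // now stop recording invocations too
    rec.record_invocation("Background::draw");      // dropped
    assert(rec.count(EventKind::Invocation) == 2);  // only the first two invocations were kept
    return rec;
}
```

Calling simulate_run() yields a history of three events, two invocations and one variable use; the later events are dropped because the corresponding switch was off when they occurred. Keeping the two switches separate means that the volume of either kind of trace information can be limited without affecting the other.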