"C++ Design and Implementation Challenges in Technology Computer Aided Design Frameworks" Goodwin R. Chin (gchin@watson.ibm.com)[y], Dharini Sitaraman[z], Chung Yang[z], and Martin D. Giles (madagil@dip.eecs.umich.edu)[z] [y] IBM T. J. Watson Research Center, Yorktown Heights, NY 10598. [z] Solid State Electronics Laboratory, University of Michigan, Ann Arbor, MI 48109. 1 Introduction ----------------- Interoperability of physical simulation tools is often cumbersome due to dif- ferences in domain representation. The Technology Computer Aided Design (TCAD) community (1) has realized this problem and is working on providing an object-oriented interface for representing and manipulating the state of a semiconductor wafer during process and device simulation, the Semiconduc- tor Wafer Representation (SWR) [1, 2, 3]. We describe an implementation of components of a 3D SWR designed to support the needs of many TCAD simulators. A set of classes provide support for solid modeling operations, mesh refinement operations, and data interchange. A layered memory sub- system allows either uniprocess or client/server operation. Run-time type identification provides dynamic extensibility of data types. The TCAD world is generally divided into process modeling (simulat- ing the fabrication of semiconductor device structures) and device model- ing (simulating the electrical characteristics of these devices). Furthermore, process modeling tools can be partitioned into topography modeling tools which change the physical shape of the device and bulk processing tools which change the material and electrical properties of the device. Examples of topography processes include lithography (pattern transfer of the inte- grated circuit onto the semiconductor), deposition of electrically insulating and conducting materials, and etching (removal) of these materials. Bulk processing steps include the implantation of chemical species into the bulk ______________________________ 1) TCAD is a subset of physical simulation that primarily focuses on the simulation of the fabrication and electrical testing of semiconductors. substrate material and the diffusion of these species in the substrate to alter electrical characteristics. Steps that involve both topography and bulk pro- cessing include thermal oxidation and silicidation - steps common in the fabrication of modern day devices. Frameworks facilitate the development and use of modeling tools by pro- viding infrastructure and services [4]. Many of these framework services, such as user interface management and intertool communication, are domain- independent in that their value to a framework is independent of the partic- ular application (e.g. electrical CAD or mechanical CAD). Services such as design data representation are domain-dependent in that they are strongly coupled to the application area to be supported. Development of frame- work standards is being promoted by the CAD Framework Initiative (CFI), a consortium of end users and vendors worldwide, dedicated to defining in- terface standards for the infrastructure of design automation tools and for design data used by these tools. When vendors adhere to these standards, interoperability between a heterogeneous set of design automation tools is ensured. The design data representation requirements for TCAD can be supported by using two representation components for wafer state - a geometry to represent the shape of the wafer and a field to represent the values of a particular solution variable at various points in the domain. Figure 1 shows a 2-D cross section of a wafer with the geometry and field portions denoted. Field values are usually associated with a mesh across the domain that is used to solve partial differential equations that describe the physical processes that are simulated. Section 2 describes an implementation strategy to support both geom- etry and mesh views of the domain. These representations pose different storage/functionality requirements and a need exists to allow easy access to both representations for the same domain. Implementation of classes to sup- port meshing algorithms is discussed with analysis of the benefits of using C++ features such as templates and multiple inheritance with virtual base classes. Section 3 describes implementation of a new flexible scheme to support field values on generic meshes with multiple interpolation functions. In par- ticular, we present a run-time type library with automatic updating of virtual Figure 1: SWR Geometry and Fields table pointers to allow sharing of C++ objects across multiple executables. This is an extension of the run-time typing necessary to express user-defined fields. 2 Cells ---------- A Cell is a geometric object that can be used to describe the shape of the simulation domain (Figure 1). Cells can be simple (e.g. rectangle) or complex (multi-faceted polyhedral). A Cell Complex is formed from a set of cells with a common property (e.g. same material). Cell Complexes facilitate implementation of efficient material boundary queries. Topography-based simulators typically operate in the following fashion: o extract exposed top surface of the wafer (set of edges for a 2D wafer and a set of faces for a 3D wafer). o using models, possibly physics-based, determine the new position of the surface. This may require information from the volumes below the surface. o update the original wafer with the new surface. This may require adding new cells or modifying the boundaries of existing cells as vol- umes may be modified. Field values that are topography-related may also need to be updated. Use of solid modeling boolean operations such as subtract and glue (Figure 2) provide robust update of the original wafer. To support efficient boolean operations, rich (in memory usage and functionality) data structures such as the "staredge" [5] are required. Fortunately many domains can be described using relatively few cells. Figure 2: Set Operations Field-based simulators often use meshes to describe solution values on the domain and to discretize partial differential equations used to describe physical models (Figure 1). Cells can also be used to describe the elements within a mesh. As the number of elements in a mesh is typically orders of magnitude greater than the number of cells necessary to describe the shape of the domain, a compact data structure is required to minimize stor- age requirements. The minimal mesh representation adequate for efficient element-based matrix-assembly, a common operation for simulators, is an enumeration of all of the vertices (nodes) in the mesh, a list of elements, where each element contains an ordered list of nodes that define its bound- ary, and a list of neighboring elements. Table 1 shows the amount of storage necessary to store a tetrahedral mesh using a variety of data structures. No- tice the dramatic increase in storage necessary to provide efficient boolean operations using the "staredge" data structure. Even supporting higher-level functionality such as meshing results in a 6-fold increase of storage over a minimalistic representation. -------------------------------------------------------------- | Data Structure | Memory Usage per Tetrahedral | -------------------------------------------------------------- | Staredge | 1.1 kBytes | | | | | Meshing Cells | 450 Bytes | | | | | Minimal mesh description | 76 Bytes | -------------------------------------------------------------- Table 1: Memory usage for a variety of data structures on a typical 3D mesh (729 vertices, 4184 edges, 6528 faces, 3072 tetrahedron) Many meshed-based simulation tools may require another service, mesh refinement, as numerical considerations may require the refinement of an element into a set of smaller elements. To support both mesh refinement and to improve the computational efficiency of matrix assembly, a MeshingCell class is introduced in Section 2.1. Both the geometry cell and meshing cell classes are inherited from an interface base class as shown in Figure 3. EXPRESS-G notation[6] is used to describe relationships between classes (2). Classes are enclosed in rectangles, with lines indicating inheritance. Relationships are unidirectional with the circle indicating the "to" side of the relationship. Errors in converting beween the different representations can be minimized by having both types of cells obtain their vertices from a common pool. This ensures that corresponding vertices in the geometry and mesh representations correspond to exactly the same object rather than using numerical comparison to ensure that the vertices share the same location in space. Operations common to the base class include connectivity information such as location of the cells in space. These operations are sufficient to develop file-based translators to existing TCAD simulators. In addition, a minimal mesh description can be generated for minimal persistent storage. ______________________________ 2) In this paper only inheritance relationships will be shown. Containment relationships can be shown by using thin lines between classes. Figure 3: Cell Class Diagram in EXPRESS-G Notation 2.1 MeshingCells ------------------ To support the integration of existing meshers and to facilitate the imple- mentation of new meshing algorithms, a set of classes have been developed as shown in Figure 4. A MeshingCell contains information to describe its boundary, cells of higher dimension that contain it, and the Cell Complex that contains it (3). For example, edge "B" in Figure 5 would contain ref- erences to vertices "C" and "D" that describe its boundary, a reference to face "A" as a parent cell that contains it, and a reference to a cell complex. MeshingCells have two template arguments. The first argument is the topo- logical dimension of the cell (e.g. a vertex has topological dimension 0, an edge has topological dimension 1, a face has topological dimension 2) and the second argument describes the dimension of the space the cell is embed- ded in. For example, an edge in a volume (3-space) would be denoted as MeshingCell<1,3>. A MeshingCell class might be declared as follows. template class MeshingCell { private: Set > boundary; Set > parents; public: MeshingCell(Set >&); Iterator > boundary() const; }; A ShadowMeshingCell is similar to a MeshingCell but adds additional references to interior subcells to support intermediate steps associated with mesh refinement. Consider a simple face refinement algorithm (Figure 6): ______________________________ 3) This data structure is very similar to the "staredge" without explicit enumeration of "neighborhood" information (e.g. given a vertex in a particular Cell Complex, enumerate the faces which contain the vertex). Figure 4: Meshing Classes in EXPRESS-G Notation Figure 5: Cell Composition o add vertex in the center of the face (step 1). o attach edges from the corners to the center point (steps 2 - 5). The addition of the vertex forms an invalid cell as the point is neither on the inner or outer boundary. However use of the ShadowMeshingCell allows the temporary association of the point with the current face. Addition of the first edge again results in an invalid cell (4) - using the ShadowMeshingCell the edge can be associated with the face. Addition of the second edge causes the creation of two valid cells. Many meshing algorithms are independent of spatial dimension (i.e. an algorithm for refining a plane should be the same whether the plane is em- bedded in 2-D or 3-D). However implementation of 2-D cell meshers usually assume that the cell is also embedded in 2-D - and the mesher will not be usable to mesh the surface of a volume. One way to provide surface meshing capability using an existing 2-D mesher is the following. ______________________________ 4) we do not allow manifold cells in our representation. Figure 6: Refining a face find a transformation that maps the face into the x-y plane create a MeshingShadowCell<2,2> for the face map edges of the face to MeshingShadowCell<1,2> objects map vertices of the face to MeshingShadowCell<0,2> objects. this will require using the transformation on Point<3> objects. based on connectivity of the face, clone the connectivity hierarchy to the MeshingShadowCells. extract information from the MeshingShadowCell<2,2> and send it to the existing mesher. The MeshingTopologicalCell base class provides connectivity information in a spatially independent fashion such as a list of edges that comprise the boundary of a face. This information can be used to clone the structure of the face in the above algorithm. 2.2 Judicious Use of Virtual Base Classes ------------------------------------------- As discussed in Section 2.1, MeshingShadowCells can be used as reduced spa- tial dimension representations of MeshingCells. This relationship is main- tained using a name matching scheme. During construction of the Meshing- ShadowCell, name matching between MeshingTopologicalCells and Meshing- ShadowCells of lower dimension is necessary as shown in the algorithm for spatial dimensional reduction of an edge of a face: for (each vertex in the shadowed face) if (name of vertex == name of origin of edge) origin of shadowed edge = current vertex of face else if (name of vertex == name of destination of edge) destination of shadowed edge = current vertex of face Two implementations for naming were investigated and class diagrams are shown in Figures 4 and 7 : o Using virtual base classes to provide automatic name comparison. Classes MeshingTopologicalCell, MeshingCell, and MeshingShadowCell are all derived from the base class NamedBase. o Explicitly providing comparison operators between MeshingTopologi- calCell and MeshingShadowCell that work on name comparisons. This scheme does not require the use of virtual base classes. Figure 7: Class Diagram Using No Virtual Base Classes The benefits of the virtual base approach are as follows: o Virtual base classes make it clear that comparison between Meshing- TopologicalCell and MeshingShadowCell objects is implemented as com- parison between objects of the Name class and not due to some other criterion (e.g. same address). o Virtual base classes reduce the amount of coding necessary to add an- other class which would need to use name comparison with either the MeshingShadowCell or MeshingTopologicalCell classes. On the other hand, there is usually a significant overhead necessary to sup- port virtual bases because class-dependent offsets must be computed (5). This problem is discussed in more detail by Nackman and Barton[7]. As cell shadowing occurs infrequently compared to other more time consuming op- erations that actually refine the cells, use of virtual base classes does not pose a large penalty in overall performance. ______________________________ 5) Of course, the amount of overhead is implementation dependent. 2.3 Use of Templates ---------------------- Sometimes classes that seem perfect for templatization from a conceptual point of view become less attractive during implementation. Consider a point in space. Points could be templatized by spatial dimension and would possess common methods such as components of their coordinates and distances from other points of the same spatial dimension. A templatized class declaration might look like the following: template class Point { public: Point(double* coordinateArray); double distance(const Point&) const; double[spatialDim] components() const; private: double components[spatialDim]; }; While extendible from 1D to 3D, the form of the constructor and com- ponents method are awkward as the user would be forced to place the coor- dinates into an array rather than using a more friendly constructor such as Point<2>::Point(double x, double y). As a friendly programming in- terface is more important than the number lines of code manually typed, the Point classes were implemented using template specialization (e.g. Point<1>, Point<2>, and Point<3> were defined and declared separately). Another possible approach is to implement the Point class using inheri- tance. A base class containing the distance and components methods could be used with the corresponding derived class responsible for construction and data storage. This approach was rejected due efficiency and storage consid- erations associated with the use of the virtual table. Use of a virtual pointer requires the use of an extra word of storage for a Point _ which can be a large percentage of the total space allocated for each point (2 words are nec- essary to store a 2D point and 3 words are necessary to store a 3D point). For compilers that cannot inline virtual functions, the overhead of calling non- inline versions of virtual methods of simple methods such as components() can be tremendous (100% time overhead on a RS/6000). The MeshingCell classes exhibit similar behavior for constructors. For ex- ample, a vertex is defined by a point and similarly an edge (MeshingCell<1,3>) is defined by a set of vertices. In addition only certain operations ap- ply to particular objects. For example the distance operation only makes sense between two vertices. Users also complained about remembering that MeshingCell<1,3> was a edge in 3D. Using typedefs to translate the topo- logical template argument into a more recognizable name help alleviate the above problem. Hence, conceptual templatization can break down in imple- mentation. 3 Fields ----------- A field is a mapping from a domain into a range set. For example, a topo- graphic map displays a field which has as its domain a particular area of the country and as its range set real numbers corresponding to elevation above sea level. In the TCAD world, the field domain is usually the physical space occupied by the semiconductor wafer. The field range set depends on the physical quantity involved, and could be a real number representing atomic concentration, a real vector representing current flow, a complex number representing refractive index, or some larger combination of real and inte- ger components. Most simulation programs use discrete approximations for fields, subdividing the domain into small, regularly shaped elements which comprise a mesh, storing range values at a limited number of locations within the mesh, and using an interpolation rule to estimate the field value at other locations across the domain. The goal of the Fields portion of SWR is to pro- vide objects and methods to efficiently represent and manipulate mesh-based field descriptions for arbitrary range sets. 3.1 Representing Fields ------------------------- An overview of the main field object and message classes is given in Fig. 8, where the arrows indicate messages passing between objects in our system. Field representation is divided into four levels to allow maximum sharing of information between related fields. At the bottom level are individual mesh components such as nodes and elements. A collection of mesh components filling the domain comprise a mesh. All of these objects are parameterized by spatial dimension only using the template mechanism. The Shape Family Figure 8: Objects and messages for field representation level adds information about the storage and interpolation scheme to be used to represent the field values across the domain. This information is specific to the mesh spatial dimension but independent of the value type. At the top level, FieldOnMesh adds the range value type information to complete the field description. The components of a FieldOnMesh are connected by reference rather than by inheritance so that fields may share common in- formation. For example, a single mesh can be used with different Shape Family interpolation rules, and a single Shape Family with different range value types. This sharing is essential for efficient use of memory since, in practice, most fields across a particular domain will use the same underlying mesh. Messaging classes are used to pass requests and responses between the levels of the field object hierarchy. There is both good news and bad news in matching the requirements for fields against the capabilities offered by C++. On the positive side, tem- plate classes make it relatively simple to allow the user to work with fields of arbitrary range type. Our implementation of FieldOnMesh is parameter- ized by a range class and contains no type-specific information. Particular Shape Families require support for some algebraic operations from the range class so that they can perform interpolations, but if no interpolation is re- quired then the range class has few restrictions. On the negative side, the strong type checking provided by C++ must be sidestepped in two ways. First, range class values are stored with the components of the underlying mesh. At that level there is no value type information available and the mesh components must be prepared to accept whatever mixture of value classes is supplied by the fields that reference them. In practice this does not pose a problem because the user always manipulates field value information through the FieldOnMesh class so value type information can always be recovered. Second, the collection of all fields across a domain form a heterogeneous set. Domain field queries can return a FieldOnMesh base class, but runtime type identification (RTTI) is necessary to recover the range-specific type. The set of possible range types is determined by the client applications and must be dynamically extensible so new range types can be added without rebuild- ing the Framework code. Our solution has been to develop a generic RTTI library supporting the operations required for Fields. The design is compli- cated by the distributed nature of the Framework, which requires the sharing of RTTI information between client processes. 3.2 Runtime Type Identification --------------------------------- Mechanisms for RTTI have been implemented in several C++ libraries (for example [8]), proposed as an extension to the C++ language[9], and finally adopted as an official language extension[10]. The RTTI scheme used here was designed based on the ideas presented by Stroustrup[11] before the lan- guage extension proposal was finalized. Our goal was to develop a separate RTTI library which could be applied to several application class hierarchies, which minimized the intrusion into the application code, and which could operate with other Framework components for storage management. A sec- ondary goal was the ability to migrate to native language RTTI support when this eventually becomes available. Our primary application-level functions are packaged as macros to allow this without requiring application source code changes. In our Framework application, RTTI is used for all of the field classes parameterized by value type. 3.2.1 RTTI Interface --------------------- Figure 9: A simple application class hierarchy RTTI adds two practical capabilities to objects: the ability to downcast from a base class to a derived class in a type-checked way, and the ability to query the type of an object given a base class pointer or reference. Suppose we have a tree of application classes to which we wish to add this capability, such as is illustrated in Fig. 9. Adding our RTTI library requires two changes to the class header file and one change to the class implementation file. First, the class tree must contain the class SwrTypeBase as a base class: class Base : public SwrTypeBase { public: ....... }; Second, any descendant of Base that you wish to cast into must register its type. It does this by calling a macro in the public portion of its class member declaration, passing its own class name as the argument. class Derived : public Base { public: TYPEDERIVEDH(Derived) .......... }; If you wish to be able to cast into the middle of a class hierarchy, each derived class must also specify its parent typed class in the hierarchy. It does this using a second form of the registration macro. class Grandson : public Derived { public: TYPEDERIVED2H(Grandson, Derived) ...... }; Finally, the class implementation file must contain the corresponding im- plementation macros TYPEDERIVEDC(Derived) TYPEDERIVED2C(Grandson, Derived) Although macros conveniently hide the details of the implementation, the current language specification does not allow them to work smoothly with template class names. If the derived class is a template class with multiple arguments, the comma in the class name confuses the preprocessor because it is taken as an argument separator. The RTTI library defines a preprocessor symbol COMMA to work around this. Template arguments are required to appear in two different forms (with and without the argument-declaration portion) so both must be supplied to the macro. The corresponding header and implementation file fragments are: template class Person : public Base { public: TYPEDERIVEDH(Person) ........ }; TYPEDERIVEDCT(Person, class D COMMA class E, D COMMA E) At runtime, pointers and references to base classes can be cast into a derived class. Assuming br is a Base reference to a Grandson class object, we can recover the Grandson class object with Grandson& gsr = REFERENCE_CAST(Grandson, br); or get a pointer with Grandson* gsp = POINTER_CAST(Grandson, &br); We can also convert to an intermediate class in the hierarchy: Derived* dp = POINTER_CAST(Derived, &br); All of these calls generate a runtime error if the conversion is invalid (such as when br refers to a Granddaughter object). Alternatively, the call Grandson* dp = POINTER_CAST_NOCHECK(Grandson, &br); will return a null pointer is the cast is invalid. Const versions of each of these macros are provided. The casting macros provided by our RTTI library provide an equivalent of the capabilities in the C++ language extension[10, 12]. For example, the reference casting macro could be defined using the extension as #define REFERENCE_CAST(a,b) dynamic_cast(b) and similarly for each of the other macros. The second capability provided by RTTI is to allow queries of the type of an object given a base class pointer or reference. The primary purpose here is to allow comparison against a known type id. We provide this functionality using type comparison macros, such as if (POINTER_TYPE_MATCH(Grandson, &br)) .... if (POINTER_TYPE_MATCH(Derived, &br)) .... if (REFERENCE_TYPE_MATCH(Grandson, br)) .... In each case, the macro returns true only if a conversion is possible to the specified type. In our example above, each of the tests returns true because br is a Base reference to a Grandson object. This behavior does not match the type comparison scheme adopted in the language extension, which would return false for the both of the following tests if (typeid(Derived*) == typeid(&br) || typeid(Derived) == typeid(br) ) .... but true for both of if (typeid(Base*) == typeid(&br) && typeid(Grandson) == typeid(br) ) .... The language extension compares strictly on the basis of the value of the supplied expression and requires an exact match rather than a possible conversion path. The extension also distinguishes between polymorphic and non-polymorphic types and defines typeid information even for built-in types, which is beyond the capabilities of our library. Nevertheless, the limited form of type matching we provide has so far been sufficient for our application. 3.2.2 RTTI Implementation -------------------------- Our implementation of RTTI uses static objects to allocate and administer type information, taking care that initialization occurs exactly once for each RTTI class. The SwrType class holds the type information for a particular class: template class SwrType { private: static SwrTypeCode t; public: static const SwrTypeCode& code() { return t; } static int match(const SwrTypeCode t2) { return t.match(t2); } }; The constructor for SwrTypeCode ensures that each typecode has a unique value. Since SwrTypeCode is a static member of SwrType, exactly one unique typecode is allocated for each object type. The SwrTypeBase base class adds two virtual functions to the application hierarchy which are then overridden by the TYPEDERIVED macros for each derived class that is participating in the typing scheme. These functions use the static type member functions to answer queries about the current object type: class SwrTypeBase { public: virtual void* swr_type_match(const SwrTypeCode& ) const; virtual const SwrTypeCode& swr_type_code() const; }; #define TYPEDERIVED2C(T,P) \ void* T::swr_type_match(const SwrTypeCode& t) const \ { return SwrType< T >::match(t) ? (void*)this : P::swr_type_match(t); }\ const SwrTypeCode& T::swr_type_code() const \ { return SwrType< T >::code(); } The swr_type_match function recursively compares the requested type against the current and immediate base type until a match is found or the top of the hierarchy is reached. The result of a match is the address of the object when viewed as that particular type, or zero if no match is found. This bit pattern is cast into void* to maintain a consistent return type, from which a typed pointer can be later recovered without altering the bit pattern. At the top level, the dynamic type casting macros use the virtual functions to determine the corresponding cast object address. #define POINTERCASTNOCHECK(T,P) ( (T*)((P)->swr_type_match(SwrType::code()))) The overhead for adding our RTTI library to a set of applications classes is to force each participating class to be polymorphic. In the worst case, this may add storage for one virtual function table pointer for each level in the class hierarchy. Runtime overhead appears at the time a dynamic cast is made, adding one function call and comparison for each level examined in the class hierarchy. 3.3 Sharing Runtime Types --------------------------- Memory management for the Framework is provided by a library which can be configured in two modes. In the "local process" mode, new objects are al- located on the heap as usual. In the "shared memory" mode, pools of shared memory are managed between client processes to allow sharing of objects. We do not support simultaneous access of objects by multiple clients. In- stead, self-contained collections of objects describing the fields of a particular wafer state are passed from client to client. Figure 10: Components of a C++ library object Sharing between C++ programs is considerably more complicated than with C or FORTRAN. C++ objects conceptually contain both data and functions to operate on the data. Data values can be different for each instance of the class, but the functions are common to all instances of the class. This leads to implementations where the C++ object is divided, with the data stored on the stack or heap and the functions compiled into the executable for the program, as illustrated in Fig. 10. In particular, C++ objects with virtual functions contain at least one virtual function table (vtbl) pointer which is set to the address of the vtbl in the executable which created the object. An object which is identical in all user-defined fields may be represented by a different bit pattern if created by a different executable. This prevents na"ive sharing of objects by saving and restoring dumps of memory locations or, in this case, by sharing a common memory pool between processes. We have extended our RTTI scheme to allow objects to be "localized" to a particular executable by updating the process-specific information while leaving the other object fields untouched. This shared runtime type identi- fication (SRTTI) allows collections of objects to be passed between clients without converting to an intermediate exchange format or otherwise packing and unpacking the object contents. 3.3.1 SRTTI Interface ---------------------- Adding the capability to share objects requires two steps beyond the capa- bility provided by the memory management library to allocate object in a shared memory area. First, the runtime type information must be extended to include additional bookkeeping. Second, clients must invoke the object localization procedure when a collection of objects moves under its control. The change from plain RTTI to SRTTI does not require any source code changes to the application class descriptions. Turning on a preprocessor symbol SHAREDTYPES adds functionality to the existing TYPEDERIVED macros to include the infrastructure necessary to support type sharing. There are new constraints on the classes which are suitable for participation in SRTTI using the predefined macros: each base class of an SRTTI class must have a default constructor which does not modify the object state. This restriction can be removed if the user is willing to manually expand and modify the macro definition to add dummy base class construction arguments. In our implementation, we have hidden the object localization procedure inside the mechanism that attaches a client to a pool of shared memory ob- jects. We have assumed that each application will already have a mechanism in place to access every application-level object in the pool. At the time a new client attaches to an existing shared pool, it is required to call the virtual member function Fixme for every object in the pool. After this has been done, a final function call registers the current client as the object pool owner. No special action is required for subsequent object creation/deletion, nor when the client releases the memory pool. 3.3.2 SRTTI Implementation --------------------------- In our plain RTTI scheme, objects are identified by the SwrTypeCode value returned by their swr_type_code member function. The type code itself is not a data member of the object - the virtual function mechanism is sufficient to ensure that the corresponding swr_type_code member function is called. One consequence of this is that each object type must have a corresponding unique virtual function table address because it describes a function that is unique to each typed class. SRTTI operates by building a mapping between type codes and vtbl addresses so that the localization of objects can be performed. Note that an object may contain multiple vtbl pointers. The vtbl used by the SRTTI mechanism is the one corresponding to the SwrTypeBase class which is a base class of all participating classes. The SRTTI scheme operates in three phases. At static initialization time for a particular client, the SRTTI library builds a linked list of SwrTypeCode- vtbl pairs for all classes participating in SRTTI. This is accomplished by adding a second static member, swrlink, to the SwrType class with a con- structor which accumulates the pairs. The list is held by a common base class of all SwrType classes called SwrTypeShareBase. We have also added two member functions to SwrType. The Fixme method will later be used to localize the object to a new executable. The SwrTypeMyvtbl is a compiler- specific routine to obtain a reference to the vtbl for this type which will be used to both read and write the vtbl information. The version shown below is appropriate for our Sun and IBM RS/6000 compilers. template class SwrType : public SwrTypeShareBase { private: static SwrTypeCode t; static SwrTypeLink swrlink; public: static SwrTypeCode& code() { return t; } static int match(const SwrTypeCode t2) { return t.match(t2); } static void Fixme(T* t,sList *tlist); // this is compiler specific static int& SwrTypeMyvtbl(SwrTypeBase& r) { return *(int*)((void*)(&r)); } }; In order for the swrlink constructor to be able to call the SwrTypeMyvtbl method, it needs an object of type T to work with. The TYPEDERIVED macros add a dummy constructor to the application class to facilitate this, but it does constrain the possible application classes to those where a temporary copy of the object can be created and deleted by the SRTTI library. template SwrTypeLink::SwrTypeLink() { T t(swrTypeDummyObject); SwrTypeShareBase::SwrTypeAddtoList(SwrType::code(), SwrType::SwrTypeMyvtbl(t)); } The second phase of SRTTI occurs when a new client attaches to an existing shared memory pool. This pool contains the collection of application objects and the linked list of type-vtbl pairs from the originating client. The second client also has its own linked list of type-vtbl pairs with the same type codes but different vtbl addresses. The SwrTypeBase class and TYPEDERIVED macros have been extended to include a new virtual function swr_vtbl_fix which is able to localize all of the vtbl pointers for an object using the dummy constructor and placement new syntax: #define TYPEDERIVED2C(T,P) \ .... \ void T::swr_vtbl_fix() \ { (void) new ((void*)this) T(swrTypeDummyObject); } Initially, the second client is unable to call this virtual function because it too relies on a virtual function table pointer which still refers to the original client. When the application calls the Fixme method for a particular object, the (old) vtbl address from the object is used as a key to obtain the type code from the first client pair list. This type code is then used as a key to obtain the new vtbl address from the second client pairlist. After this new vtbl has been written into the object, the swr_vtbl_fix virtual function can be called to complete the localization process. template void SwrType::Fixme(T* t,sList *tlist) { int i = getTypeCode(t,tlist); // map from old vtbl to typecode int vptr = getVtblPtr(i); // map from typecode to new vtbl SwrTypeMyvtbl(*t) = vptr; // bootstrap: first fix one vtbl entry t->swr_vtbl_fix(); // use virtual function to fix the rest } After every object in the shared memory pool has been updated, the second client type-vtbl list replaces the original copy in shared memory so that it is available for use by a subsequent client. Figure 11: Sharing typed objects: before and after localization by B As an example, consider the case where executable A has created an instance of a Derived object in shared memory which is now required by executable B, as shown in Fig. 11. Shared memory also contains the A-type- vtbl pair list for executable A created by the SRTTI scheme. At startup, executable B builds its own B-type-vtbl pair list during static initialization with same type information but its own vtbl addresses. When control of the Derived object passes from A to B, executable B calls the Fixme method in Derived, supplying the SRTTI list from executable A. Fixme uses the old vtlb information from the Derived object to look up its type code in the A-type-vtbl list. It then uses the type code to look up the local vtbl address in the B-type-vtbl list and updates the value stored in the Derived object. At this point it is free to call any virtual function from the SwrTypeBase class. A call to swr_vtbl_fix converts any remaining vtbl pointers to their appropriate local values. SRTTI has an overhead in terms of both space and time. Space must be allocated in shared memory to hold the list of type-vtbl pairs which will have a size proportional to the number of classes participating in the SRTTI scheme. Runtime is consumed during the startup of each client as the local type-vtbl list is constructed, again proportional to the number of partici- pating classes. There is also a runtime cost whenever a client attaches to a shared memory object pool, proportional to the number of objects in the pool. These runtime costs are small compared to alternatives requiring the full object contents to be manipulated. 4 Conclusions ---------------- We have successfully implemented components of a 3D TCAD Framework using C++. Language features within C++ were essential for achieving our design goals for development of classes representing meshes, geometries, and fields. Fields require run-time type identification and classes that operate independent of range value type to allow mesh and shape information to be shared among many fields. This scheme can be extended to support the use of shared objects in a client/server architecture by providing the means to update virtual function tables when a new client comes into scope. Cells are used for both mesh and geometry representation and are reconciled through a base class that provides containment and simple neighborhood information suitable for interchange with existing simulators. To support spatially independent mesh refinement, a shadowing scheme is introduced which requires the use of name comparison. Two different implementations of name comparison trade off flexibility for performance. Some abstractions that lead naturally to template use in theory end up requiring template specialization to make the programming interface more user-friendly. The use of virtual functions for classes with a small number of member data and requiring many instances is very memory inefficient. While the use of C++ provided an extremely attractive mechanism for problem abstraction along with easily maintainable and debuggable code, some real-world problems still exist with current commercial compilers. Im- plementation bugs related to templates and specialization are common. Other compilers take an extremely long time to link an executable containing many template classes because multiple instantiations of identical templates are formed during the template instantiation phase. These practical roadblocks are a significant factor weighing against C++ compared to more mature languages, but are not sufficient to outweigh the benefits for our particular application. References ---------- [1] SWR Working Group of the CFI/TCAD TSC. "Semiconductor Wafer Representation Architecture, Version 1.0". CAD Framework Initiative, Austin, Texas, July 1992. [2] SWR Working Group of the CFI/TCAD TSC, editor. "Semiconductor Wafer Representation Procedural Interface, Version 1.0". CAD Frame- work Initiative, Austin, Texas, July 1992. [3] M. D. Giles, D. S. Boning, G. R. Chin, W. C. Dietrich, M. S. Karasick, M. E. Law, P. K. Mozumder, L. R. Nackman, V. T. Rajan, D. M. H. Walker, R. H. Wang, and A. S. Wong. "Semiconductor wafer representa- tion for TCAD". IEEE Trans. Computer-Aided Design, January 1994. [4] D. S. Harrison, A. R. Newton, R. L. Spickelmier, and T. J. Barnes. "Electronic CAD Frameworks". IEEE Proceedings, 78(2):393-417, Febru- ary 1990. [5] M. Karasick. "On the Representation and Manipulation of Rigid Solids." PhD thesis, McGill University, Montreal, Quebec, 1988. (available as Cornell Univ. Dept. of Computer Science 89-976, Ithaca, NY). [6] P. Spiby and D. A. Schenck. "Express language reference manual". ISO TC 184/SC4/WG5/P3, 1991. [7] Lee Nackman and John Barton. "Base class composition with multiple derivation and virtual bases". In Proceesings 1994 C++ USENIX Con- ference, April 1994. [8] K. E. Gorlen, S. M. Orlwo, and P. S. Plexico. "Data Abstraction and Object-Oriented Programming in C++". Wiley, 1990. [9] B. Stroustrup and D. Lenkov. "Run-time type identification for C++ (revised)". In Proceedings of the Usenix C++ Conference, August 1992. [10] B. Stroustrup and D. Lenkov. "Run-time type identification for C++ (revised yet again)". X3J16/92-0121=WG21/N0198, 1992. [11] B. Stroustrup. "The C++ Programming Language". Addison-Wesley, 2nd edition, 1991. [12] Josee Lajoie. "The new language extensions". C++ Report, 5(6):47-52, July 1993.