	 The following paper was originally published in the
	   Proceedings of the Fourth Annual Tcl/Tk Workshop
		   Monterey, California, July 1996


Automated Wrapping of a C++ Class Library into Tcl

Ken Martin
General Electric Corporate Research and Development
1 Research Circle, Niskayuna, NY 12309
martink@crd.ge.com

Abstract

In this paper we describe an approach to wrapping a preexisting C++ class
library into the interpreted Tcl environment. Specifically, we look at our
efforts over the past two years to add the Tcl interpreted environment on top
of the Visualization Toolkit, which consists of over three hundred C++
classes. We address how we overcame the fundamental issues involved in
wrapping existing C++ code and what limitations we had to accept. We contrast
our approach to other Tcl object oriented extensions such as Object Tcl and
[incr Tcl] and explain why they were not suitable. We conclude by looking at
future directions for our work.

1. Introduction

Over ten years ago researchers at General Electric started work on an
interpreted, object-oriented visualization system [2]. At the time, C++ was
still in its infancy and Smalltalk was considered one of the most promising
languages. Taking some of the best features of Smalltalk, a new language was
developed called LYMB. This provided an interpreter and script based language
very similar to Tcl with the exception that it was built on object-oriented
principles. While the system was very valuable, it had some weaknesses that
made it monolithic and difficult to integrate into other systems. In late
1993 an effort was started to create a new system that would overcome many of
the problems associated with its predecessor.

One thing we had learned was that object-oriented technology was a big
productivity enhancer for our group. We decided that any replacement system
should be object oriented. We also wanted the system to be portable,
extensible and easily integrated into a wide variety of applications. To this
end we selected C++ as our development language. About eighteen months into
the development of The Visualization Toolkit [5] (VTK), as it was later named,
we realized that using only C++ was proving to be a problem. In LYMB we had
an interpreted environment that made interactive development very quick and
easy. With VTK we had no interpreter or scripting language, and its loss was
profoundly felt. At this point we started looking into adding an interpreter
into, or on top of, our C++ class library.

Unlike LYMB which used its own interpreted language, we wanted to select a
language that was already established. We would focus on data visualization
which was our strength, and let others (such as the Tcl community) focus on
language issues. The decision came down to Tcl or Python. Tcl's wide
acceptance, maturity and easy extensibility won out in spite of the fact that
it didn't have any direct object-oriented support.

After some quick research into the Tcl community, we found that there wasn't
a fully automatic way to wrap C++ into Tcl without significantly modifying
the C++ code. Since we already had over 200 classes written, a serious change
in coding style was out of the question. At this point we decided to try
writing our own automated C++ code wrapper for Tcl, the result of which is
discussed in the remainder of this paper.

2. Related Work

When we started writing our C++ to Tcl wrapper generator there were no
suitable alternatives. Now, over two years later, much progress has been made
on this front and it is worth revisiting. [incr Tcl][3][4] has emerged as a
popular object oriented extension to Tcl providing objects, mega-widgets,
namespaces and more. While this system does provide strong encapsulation and
integration with Tcl, it does not provide a strong interface to C++
classes. [incr Tcl] allows you to bind C functions or static C++ functions
into [incr Tcl] classes. It does not support wrapping of C++ classes,
construction or destruction of C++ objects, or invocation of C++ member
functions. Object Tcl[7] is similar to [incr Tcl] in that it is designed to
support object oriented programming for Tcl, not the incorporation of
existing C++ libraries.

The Simple Wrapper Interface Generator (SWIG) by Dave Beazley is a tool
that can generate wrapper code for a number of different target languages
including Tcl. SWIG requires an interface file that prototypes the functions
that are to be wrapped. For many C libraries the ANSI C prototypes can be
directly included into the interface file. For C++ code it isn't as
easy. SWIG provides a good wrapper for prototype functions, but it doesn't
provide much support for wrapping C++ classes. Creating and destroying
objects, type conversion, and passing objects as arguments are not supported
in the current version. So while SWIG is an excellent package for wrapping C
libraries, it currently isn't suitable for wrapping C++ class libraries.

Objective-Tcl[1] does provide much of the functionality we required for
wrapping VTK. Since we had decided to use C++ as our compiled language and
Objective-Tcl binds to Objective-C, we were not able to use it. In some ways
Objective-Tcl is a better solution since it supports creating classes and
methods within the Tcl domain. Our solution provides access to the C++
objects, but it does not provide any method for object oriented programming
in the Tcl domain. The use of Objective-C's run time class information solves
some of the tricky problems of C++ object integration that we had to address.

Otcl[6] is the system that most resembles our solution. In many ways Otcl,
like Objective-Tcl, provides a better solution to wrapping existing classes
into Tcl. But there are some key differences between Otcl and our
approach. Otcl requires the user to specify a Class Description Language
(CDL) file for all the objects to be integrated into Tcl. In this file,
the user must specify the public methods, their arguments and whether they
require static or dynamic binding. In our approach the required information is
extracted from the C++ header files.

Otcl currently doesn't handle type conversions between superclasses and
subclasses which can lead to problems, especially in libraries that make use
of multiple inheritance. Early in our development we ran into this problem
which prompted our efforts to perform true type conversion for C++
objects. To support VTK's hardware independence, instances are created on the
C++ side which must be returned and used on the Tcl side. Otcl supports this
through the use of special classes that the C++ side can instantiate. But
this means that the original C++ code must be modified to create instances of
these special classes instead of its normal behavior. This limitation isn't
present in our approach. Otcl also lacks support for overloaded methods,
which we support.

While Otcl may not provide as tight a wrapping of C++ classes, it does
provide much more flexibility in the Tcl domain. C++ classes can be
subclassed in the Tcl domain, passed back and forth, and methods added or
overridden. Our work does not support any object oriented extensions or
subclassing from the Tcl domain.

3. Methods

There are a number of difficult issues to be resolved in automatically
wrapping C++ into Tcl. We will examine them in the order that we dealt with
them. The first problem that we ran into was parsing our C++ code. In order
to wrap it into Tcl, we needed a method of parsing the code and breaking it
into its lexical components. To do this we decided to use the traditional
tools LEX and YACC.

As it turns out, writing LEX and YACC code for C++ is not an easy task since
C++ has so many features. There were two ideas that enabled us to get by. The
first was our philosophy regarding C++. We believed in using only the core
features of C++ within a relatively rigid development environment. We did not
use templates, run time type checking or exceptions. Our coding standards
dictated a common prefix (vtk) for all class names, a limit of one
class per file, and a set of simple macros for performing common set/get
methods. With these restrictions we were able to develop a LEX and YACC based
program that could parse our header files and extract the pertinent
information. We did not make an attempt to handle preprocessor directives,
which means that our wrapping of a class is done based on that class's header
file alone, without looking at its superclasses' header files.

The first step in wrapping a class library is to determine which classes are
concrete and which are abstract. In object-oriented terminology, abstract
classes define an API for their subclasses. They are not meant to be
instantiated, and attempting to do so can be a compiler error. To
resolve this we modified our LEX and YACC code to parse the header files and
look for any pure virtual functions, which indicate that a class is
abstract. We then create a list of abstract classes and concrete
classes. This approach is not foolproof. A class whose superclass is abstract
may not have made itself concrete by providing a concrete implementation of
the pure virtual function of the superclass. Since
we judge each class based solely on its header file, we cannot detect this
condition. In this case, which has happened only twice in our now 360
classes, we must manually specify that the class is abstract.
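
The classification step can be illustrated with a rough sketch. It is not the
LEX and YACC program described above: it uses std::regex as a stand-in, and
the function name declaresPureVirtual is hypothetical. Like the approach in
the text, it inspects a single header in isolation, so it cannot see pure
virtual functions that are inherited but never overridden.

```cpp
#include <regex>
#include <string>

// Returns true if the header text declares at least one pure virtual
// function, i.e. a declaration that starts with "virtual" and ends in
// ") = 0;" (optionally with "const"). This is a rough textual check for
// illustration, not a real C++ parser.
bool declaresPureVirtual(const std::string &headerText)
{
  static const std::regex pureVirtual(
      R"(virtual[^;{}]*\)\s*(const\s*)?=\s*0\s*;)");
  return std::regex_search(headerText, pureVirtual);
}
```

A class whose header passes this check would go on the abstract list; every
other class would be treated as concrete unless it is manually marked.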

For all of the concrete classes we run a second LEX/YACC program to generate
the required Tcl code to create a new instance. We use the same model as the
Tk widgets in that the class name serves as the command to create an instance
of that class. In our VTK_Init function, we simply create a command using
Tcl_CreateCommand for every concrete class in the library. All of these
commands invoke the same function vtkNewInstanceCommand, which checks to make
sure that the instance name was specified and is unique. It then creates a
new command with the same name as the instance, and attaches it to a function
that is responsible for handling methods on that class. It also creates a C++
instance of the desired class and associates the Tcl instance name with the
Tcl method function for that class and the C++ object pointer. We use three
Tcl hash tables to store these associations.

The code sample in Listing 1 shows a small portion of the
vtkNewInstanceCommand function, namely the logic for creating an instance of
the class vtkActor. First it does some quick checks to make sure that the
desired instance name is specified, does not start with a digit, and isn't
already in use. Then it does a string compare for argv[0]
against "vtkActor". If it matches, it calls a function called
vtkActorNewCommand which will create an instance of vtkActor and then return
it as type ClientData. The next two lines create an association between the
Tcl name for that instance and the actual C++ pointer address. This is stored
in the vtkInstanceLookup hash table. The next three lines perform a similar
task, associating the pointer address with the Tcl instance name. The next
line creates a Tcl command with the same name as this instance. It also
associates the vtkActorCommand function with this instance to handle any
method invocations. The final two lines create an association between the Tcl
instance name and the function that should be used for invoking methods.

Listing 1. C++ instance creation code for Tcl.

int vtkNewInstanceCommand(ClientData cd, Tcl_Interp *interp,
                          int argc, char *argv[])
{   
  Tcl_HashEntry *entry;
  int is_new;
  char temps[80];

  if (argc != 2)
    {
    interp->result = "vtk object creation requires one argument, a name.";
    return TCL_ERROR;
    }

  if ((argv[1][0] >= '0')&&(argv[1][0] <= '9'))
    {
    interp->result = "vtk instances names must start with a letter.";
    return TCL_ERROR;
    }

  if (Tcl_FindHashEntry(&vtkInstanceLookup,argv[1]))
    {
    interp->result = "a vtk instance with that name already exists.";
    return TCL_ERROR;
    }

  /* Create an instance of vtkActor */
  if (!strcmp("vtkActor",argv[0]))
    {
    ClientData temp = vtkActorNewCommand();
    entry = Tcl_CreateHashEntry(&vtkInstanceLookup,argv[1],&is_new);
    Tcl_SetHashValue(entry,temp);
    sprintf(temps,"%p",(void *)temp);
    entry = Tcl_CreateHashEntry(&vtkPointerLookup,temps,&is_new);
    Tcl_SetHashValue(entry,(ClientData)(strdup(argv[1])));
    Tcl_CreateCommand(interp, argv[1], vtkActorCommand, temp,
                      (Tcl_CmdDeleteProc *)vtkTclGenericDeleteObject);
    entry = Tcl_CreateHashEntry(&vtkCommandLookup,argv[1],&is_new);
    Tcl_SetHashValue(entry,(ClientData)(vtkActorCommand));
    }

  /* Create an instance of vtkAppendPolyData */
  if (!strcmp("vtkAppendPolyData",argv[0]))
    {
     ...
    }
  ...
}

The next step in the process creates the method functions for all of the
classes. While we only allow the user to create instances of concrete
classes, we must provide support for invoking methods from both concrete
classes and their abstract superclasses. This is because some of the methods
of a concrete class may be implemented in an abstract superclass. So we must
wrap methods of the abstract classes even though we do not allow the user to
create a direct instance of one. The method function serves to connect the
Tcl commands, with their string arguments, to the C++ method
invocations. There are four main operations that occur in these functions:
typecasting, which we will discuss later; chaining up the class hierarchy to
search for unresolved methods; handling additional methods that are specific
to the Tcl interpreted environment; and, making up the bulk of the code,
invoking the C++ methods.

Chaining up the hierarchy is handled in a simple manner. If a method wasn't
found in the current function, we then invoke the same Tcl command, but this
time using the superclass's method function. This continues until either the
method is found or the top of the hierarchy is reached. For classes with
multiple inheritance we perform a depth-first search of the inheritance
tree. If we cannot find a method that matches then we return TCL_ERROR. Since
Tcl is an interactive language, we use the method function to add two
additional commands to each class. We provide a hook for the Tcl user to
invoke the PrintSelf method on an instance and have its output returned in
interp->result. We also added a command to list all the possible methods you
can send to a class and its superclasses.
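
The chaining described above can be sketched without any Tcl headers. This is
an illustration under stated assumptions, not generated wrapper code: the
classes Base and Derived, the SKETCH_OK and SKETCH_ERROR codes (stand-ins for
TCL_OK and TCL_ERROR), the std::string result (a stand-in for
interp->result), and the command name ListMethods are all hypothetical.

```cpp
#include <string>
#include <vector>

// Stand-ins for the Tcl return codes, so the sketch needs no Tcl headers.
const int SKETCH_OK = 0;
const int SKETCH_ERROR = 1;

struct Base { int id = 7; int GetId() const { return id; } };
struct Derived : Base { int Twice() const { return 2 * id; } };

// Method function for Base: top of the hierarchy, so no further chaining.
int BaseCommand(Base *op, const std::vector<std::string> &argv,
                std::string &result)
{
  if (argv.size() == 2 && argv[1] == "GetId")
    {
    result = std::to_string(op->GetId());
    return SKETCH_OK;
    }
  if (argv.size() == 2 && argv[1] == "ListMethods")
    {
    result += "GetId\n";
    return SKETCH_OK;
    }
  return SKETCH_ERROR;  // nothing matched anywhere in the hierarchy
}

// Method function for Derived: try its own methods, then chain upward.
int DerivedCommand(Derived *op, const std::vector<std::string> &argv,
                   std::string &result)
{
  if (argv.size() == 2 && argv[1] == "Twice")
    {
    result = std::to_string(op->Twice());
    return SKETCH_OK;
    }
  if (argv.size() == 2 && argv[1] == "ListMethods")
    {
    result += "Twice\n";  // list our own methods, then fall through
    }
  // The implicit Derived* to Base* conversion adjusts the pointer if needed.
  return BaseCommand(op, argv, result);
}
```

With multiple inheritance, the final return would instead try each
superclass's method function in turn and stop at the first one that succeeds,
which gives the depth-first order mentioned in the text.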

As we mentioned above, the bulk of the method function handles wrapping the
C++ methods. From the C++ header file we obtain the names of the methods and
the number and type of arguments that they take and return. For each method
we first compare the Tcl method name, stored as argv[1], with the target
method. We then check to see if the number of arguments matches. After this
we start taking apart the Tcl string arguments and converting them into the
appropriate C++ arguments. For the traditional C data types we use the
standard Tcl functions such as Tcl_GetDouble. For passing C++ objects we use
our own function. If any of the argument conversions fail, we assume that
this wasn't the correct method and we continue searching for a match. This is
how we handle overloaded functions. Two functions may have the same name and
the same argument count, but the string arguments passed in from Tcl may
convert correctly for one method and not the other, effectively
disambiguating them. The return value of a method is converted into a string
and then returned in interp->result. The example in Listing 2, from
vtkActor's method function, will help clarify this process.

Listing 2. Excerpt from vtkActor's method function.

int vtkActorCppCommand(vtkActor *op, Tcl_Interp *interp,
                       int argc, char *argv[])
{
  int    tempi;
  double tempd;
  static char temps[80];
  int    error;

  // some type conversion routines etc
  ...

  // check for an invocation of the SetPosition method
  if ((!strcmp("SetPosition",argv[1]))&&(argc == 5))
     {
     float    temp0;
     float    temp1;
     float    temp2;
     error = 0;

     if (Tcl_GetDouble(interp,argv[2],&tempd) == TCL_OK)
       {  temp0 = tempd; }
     else
       {  error = 1;  }
     if (Tcl_GetDouble(interp,argv[3],&tempd) == TCL_OK)
       {  temp1 = tempd;  }
     else
       {  error = 1;  }
     if (Tcl_GetDouble(interp,argv[4],&tempd) == TCL_OK)
       {  temp2 = tempd;  }
     else
       {  error = 1;  }
     if (!error)
       {
       op->SetPosition(temp0,temp1,temp2);
       interp->result[0] = '\0';
       return TCL_OK;
       }
     }

  // check other methods
  ...

  // If we haven't found a match, try chaining up the superclasses
  if (vtkObjectCppCommand((vtkObject *)op,interp,argc,argv) == TCL_OK)
    {
    return TCL_OK;
    }

  ...
}

The first line checks to make sure the method name matches and that the
argument count is correct. Remember that the first Tcl argument is the
instance name, the second is the method name and the third Tcl argument is
the first C++ argument. So if we need three C++ arguments then we check to
see if argc is five. The next series of statements try to convert the Tcl
arguments into the desired type for C++, in this case floats. If all the
arguments convert correctly, we invoke the function and return TCL_OK. In
this example the C++ method does not have a return value. The C++ instance,
stored in the variable op, is passed in as the ClientData from Tcl.
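
The way failed argument conversions disambiguate overloaded methods can be
shown with a small self-contained sketch. The class Probe, its SetValue
overloads, and the helpers toInt and toDouble are hypothetical; the helpers
play the role that Tcl conversion functions such as Tcl_GetDouble play in
Listing 2, and a bool return stands in for TCL_OK and TCL_ERROR.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical class with two overloads that take the same argument count.
struct Probe
{
  std::string last;
  void SetValue(int)    { last = "int"; }
  void SetValue(double) { last = "double"; }
};

// Strict conversions: succeed only if the whole string is consumed.
bool toInt(const std::string &s, int &out)
{
  char *end = nullptr;
  long v = std::strtol(s.c_str(), &end, 10);
  if (s.empty() || *end != '\0') { return false; }
  out = static_cast<int>(v);
  return true;
}

bool toDouble(const std::string &s, double &out)
{
  char *end = nullptr;
  double v = std::strtod(s.c_str(), &end);
  if (s.empty() || *end != '\0') { return false; }
  out = v;
  return true;
}

// Try each overload in turn; a failed conversion just means "not this one".
bool ProbeCommand(Probe *op, const std::vector<std::string> &argv)
{
  if (argv.size() == 3 && argv[1] == "SetValue")
    {
    int tempi;
    if (toInt(argv[2], tempi)) { op->SetValue(tempi); return true; }
    }
  if (argv.size() == 3 && argv[1] == "SetValue")
    {
    double tempd;
    if (toDouble(argv[2], tempd)) { op->SetValue(tempd); return true; }
    }
  return false;  // no overload accepted the arguments
}
```

Note that the order of the checks matters in this scheme: the string "3"
converts cleanly to both int and double, so whichever overload is tested
first wins.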

The first concern with this approach is that C++ header files do not provide
enough information to correctly convert to and from Tcl strings. The classic
example of this is a method that returns a pointer to a float. The C++ header
file provides no information about how many elements are in that array. In
some cases the correct size may not be known until runtime, but for many
cases the return size is fixed and should be used in the wrapping process.

To support this we create a hint file that contains a series of lines of the
form: class name, method name, return type and expected size. When the
wrapping code encounters a method that it normally couldn't wrap, it checks
the hint file to see if the required information is in there. As it turns
out, our hint file is only one or two pages long. This is because many of our
methods that return fixed size arrays are done with macros which include the
size of the result in the macro call. These macros are typically Set/Get
methods for instance variables such as the (X, Y, Z) position of an
actor. These same macros allow us to pass arrays into C++ methods since the
required size of the array is known at the time of wrapping.
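
A hint lookup of this kind might be sketched as follows. The exact file
syntax is not given in the text, so the whitespace-separated layout
"class method returnType size" is an assumption, as are the names Hint,
HintTable, parseHints and hintedSize.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>

// One hint: the return type as written in the header and its element count.
struct Hint
{
  std::string returnType;
  int size;
};

typedef std::map<std::pair<std::string, std::string>, Hint> HintTable;

// Parse hint text with one entry per line. The whitespace-separated layout
// "class method returnType size" is an assumption made for this sketch.
HintTable parseHints(const std::string &text)
{
  HintTable table;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line))
    {
    std::istringstream fields(line);
    std::string cls, method;
    Hint hint;
    if (fields >> cls >> method >> hint.returnType >> hint.size)
      {
      table[std::make_pair(cls, method)] = hint;
      }
    // Lines that do not have all four fields are silently skipped.
    }
  return table;
}

// Returns the hinted array size, or -1 if the method has no hint.
int hintedSize(const HintTable &table, const std::string &cls,
               const std::string &method)
{
  HintTable::const_iterator it = table.find(std::make_pair(cls, method));
  return it == table.end() ? -1 : it->second.size;
}
```

A wrapper generator would consult hintedSize only after failing to determine
the size from the header itself, and would skip the method when it returns -1.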

Passing C++ objects to and from methods is where a great portion of the
complexity comes from. A string passed to a method that is expecting an
object can be looked up in the hash tables defined earlier. If it is found,
we then have to perform a proper C++ typecasting to the desired argument
type. For example, say that you have a class B that is a subclass of class
A. You have created an instance of B named foo from the Tcl interpreter, and
you want to pass it into a method that takes an argument of type A. To do
this we first look up the instance foo in our hash tables to get the C++
object pointer. Then we invoke the method function for that class (B in this
example) asking it to perform a typecasting to a result of type A. We obtain
the method function by looking it up in a hash table. If the typecasting is
possible, e.g. if B is really a subclass of A, then the correct pointer is
returned and the method will be invoked. Otherwise an error occurs and method
resolution continues as with any other argument mismatch.
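
The lookup-then-typecast sequence can be sketched as below. This is a
simplified illustration: the text routes the typecast request through each
class's method function, whereas this sketch uses a separate per-class cast
function, and the names CastFunc, ACast, BCast, Instance and lookupAs are
hypothetical.

```cpp
#include <map>
#include <string>

struct A { int a = 1; virtual ~A() {} };
struct B : A { int b = 2; };

// Each wrapped class gets a cast function: given a pointer known to be of
// that class, return it converted to the named type, or nullptr on failure.
typedef void *(*CastFunc)(void *obj, const std::string &target);

void *ACast(void *obj, const std::string &target)
{
  A *op = static_cast<A *>(obj);
  if (target == "A") { return op; }
  return nullptr;  // top of the hierarchy: no such conversion
}

void *BCast(void *obj, const std::string &target)
{
  B *op = static_cast<B *>(obj);
  if (target == "B") { return op; }
  // Convert to the superclass with a real C++ conversion, then chain up.
  return ACast(static_cast<A *>(op), target);
}

// What the lookup tables give us for an instance name.
struct Instance
{
  void *pointer;  // pointer of the instance's own (most-derived) class
  CastFunc cast;  // cast function of that same class
};

// Resolve a name to a pointer of the requested type, or nullptr.
void *lookupAs(const std::map<std::string, Instance> &instances,
               const std::string &name, const std::string &target)
{
  std::map<std::string, Instance>::const_iterator it = instances.find(name);
  if (it == instances.end()) { return nullptr; }
  return it->second.cast(it->second.pointer, target);
}
```

A nullptr result corresponds to the failed conversion in the text: the
candidate method is rejected and method resolution moves on.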

An example illustrates why this typecasting is so important. Given the
following piece of C++ code where B is a subclass of A, you might expect that
bar and foo would point to the same location in memory, but in many cases
they will not. Doing a blind typecasting of a C++ object obtained from the
hashtable into the desired argument type will create all sorts of problems,
especially since no checking will be done to ensure that the two C++ types
are even compatible.

class A ...; 
class B: A ...;  
B *foo; 
A *bar = foo;
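
The pointer difference is easiest to observe with multiple inheritance, so
the sketch below uses a hypothetical class B with two base classes, A1 and A2,
rather than the single base shown above. Object layout is not fixed by the
C++ standard; the offset described in the comments is what common compilers
produce, not a guarantee.

```cpp
struct A1 { int x = 0; virtual ~A1() {} };
struct A2 { int y = 0; virtual ~A2() {} };
struct B : A1, A2 { int z = 0; };

// Correct conversion: the compiler applies any needed pointer offset, so
// the result points at the A2 part inside the B object.
A2 *properCast(B *obj) { return obj; }

// Blind conversion through void*, as a naive hash-table lookup would do.
// The address is unchanged, so the result is only usable as an A2 if the
// A2 part happens to sit at offset zero, which it typically does not here.
A2 *blindCast(B *obj)
{
  void *raw = obj;
  return static_cast<A2 *>(raw);
}
```

On typical implementations properCast returns an address a few bytes past the
start of the B object, while blindCast returns the start of the object, so
calling an A2 method through the blindly cast pointer would read the wrong
memory.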

It also happens that some C++ methods create instances of C++ classes and
return them. In this case, there will not be a Tcl string name for the
instance since it was created on the C++ side. When a C++ method returns a
C++ class pointer, we look up that pointer in the hash tables. If it has an
associated name, then we return that string. If it doesn't then we generate a
name of the form vtkTemp1 or vtkTemp2 etc. We then enter that name into the
hash tables as if the user had created the instance. This is commonly used in
situations like the following Tcl script:

set activeCamera [aRenderer GetActiveCamera];

We set the value of the variable activeCamera to the generated name returned
from the GetActiveCamera method. When it comes to freeing instances, we use
the rule that if the instance was created from the Tcl side, then we will
free the C++ side when the Tcl side is freed. If the instance was created on
the C++ side, then we require that whatever class created the instance, also
free it.
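
The naming and ownership rules can be sketched with ordinary C++ containers.
This is an illustration only: it uses std::map keyed directly on pointers
instead of Tcl hash tables keyed on formatted strings, and the InstanceTables
structure, its member names, and the ownedByTcl flag are assumptions rather
than details taken from the wrapper.

```cpp
#include <map>
#include <string>

// Simplified lookup tables; the ownership flag is an assumption of this
// sketch about how "created from Tcl" could be recorded.
struct InstanceTables
{
  std::map<std::string, void *> byName;     // Tcl name -> C++ pointer
  std::map<void *, std::string> byPointer;  // C++ pointer -> Tcl name
  std::map<std::string, bool> ownedByTcl;   // true if created from Tcl
  int tempCounter = 0;

  // Register an instance the user created from the Tcl side.
  void addFromTcl(const std::string &name, void *obj)
  {
    byName[name] = obj;
    byPointer[obj] = name;
    ownedByTcl[name] = true;
  }

  // Name for a pointer returned by a C++ method: reuse the existing name,
  // or generate vtkTemp1, vtkTemp2, ... and register it as C++-owned.
  std::string nameFor(void *obj)
  {
    std::map<void *, std::string>::const_iterator it = byPointer.find(obj);
    if (it != byPointer.end()) { return it->second; }
    std::string name = "vtkTemp" + std::to_string(++tempCounter);
    byName[name] = obj;
    byPointer[obj] = name;
    ownedByTcl[name] = false;
    return name;
  }
};
```

When a Tcl command is deleted, a wrapper following this scheme would delete
the C++ object only if ownedByTcl is true for that name, leaving C++-created
objects to the class that made them.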

Now that we have discussed how we pass primitive data types, arrays and C++
objects, there is one more type of argument that is worth mentioning: the
user defined function. User defined functions are essentially callbacks that
certain classes support. The user specifies a function to be called and an
argument to pass to that function whenever a certain event occurs. With a Tcl
wrapping we don't want to pass a function pointer, but rather a string of Tcl
commands to be executed. So we package up the string and the interpreter that
sent the command into a structure, and set that as the argument to be sent to
the function. We then specify that the user defined function call the generic
function below. This function takes apart the structure and invokes
Tcl_GlobalEval to execute the string of Tcl script.

void vtkTclVoidFunc(void *arg)
  {
  vtkTclVoidFuncArg *arg2;
  arg2 = (vtkTclVoidFuncArg *)arg;
  Tcl_GlobalEval(arg2->interp,
                 arg2->command);
  }

From Tcl, a user defined function can be set up in the following
straightforward manner.

vtkContourFilter c;
c SetEndExecuteFunction {puts "Done"}
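
The packaging step on the C++ side can be sketched without linking against
Tcl. Everything here is a stand-in: SketchInterp replaces Tcl_Interp and only
records scripts instead of evaluating them, and SketchVoidFuncArg,
SketchVoidFunc and SketchFilter are hypothetical names that mirror the roles
of vtkTclVoidFuncArg, vtkTclVoidFunc and a filter class.

```cpp
#include <string>
#include <vector>

// Stand-in for Tcl_Interp: it just records the scripts it is asked to run.
struct SketchInterp
{
  std::vector<std::string> evaluated;
  void Eval(const std::string &script) { evaluated.push_back(script); }
};

// The packaged callback argument: which interpreter, and what to run.
struct SketchVoidFuncArg
{
  SketchInterp *interp;
  std::string command;
};

// Generic C-style callback: unpack the structure and evaluate the script.
void SketchVoidFunc(void *arg)
{
  SketchVoidFuncArg *arg2 = static_cast<SketchVoidFuncArg *>(arg);
  arg2->interp->Eval(arg2->command);
}

// Hypothetical filter exposing a C-style user defined function hook.
struct SketchFilter
{
  void (*endFunc)(void *) = nullptr;
  void *endArg = nullptr;

  void SetEndExecuteFunction(void (*f)(void *), void *arg)
  {
    endFunc = f;
    endArg = arg;
  }

  void Execute()
  {
    // ... the real work would happen here ...
    if (endFunc) { endFunc(endArg); }
  }
};
```

The filter never learns that a script is involved: it sees an ordinary
function pointer and an opaque argument, which is why no change to the C++
class is needed to support Tcl callbacks.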

4. Results

Using the combination of techniques described above we have developed an
automated technique for wrapping a C++ library into Tcl. We have used this
for over a year in developing The Visualization Toolkit which now contains
over three hundred classes and thousands of methods. The process has required
almost no human interaction. The most common issue is adding a few methods to
the hint file now and then. We managed to do this while making only a couple
of changes to our code. There were a few classes that had helper classes
defined in the same header file. To handle this we added special comments
around the offending code, as below:

// BTX - short for begin Tcl exclude
   code that breaks our parser in here
// ETX - short for end Tcl exclude 

Taking advantage of the dynamic loading features of Tcl7.5 we have wrapped
additional C++ libraries into Tcl so that they can be loaded at run time by
the scripts that require them. At this point our automated technique wraps
about 98% of the possible methods within VTK and generates about 100,000
lines of source code. We have had success applying the same approach to wrap
VTK into Java, although some of the issues are different. This ability to
generate wrapper code for Tcl or Java is a nice feature.

Unfortunately this technique does not leverage any of the recent object
oriented extensions to Tcl (such as Otcl) and it is not a general purpose
technique that can wrap any C++ code. As nice as that would be, it was
through our restricted use of C++ that we were able to successfully wrap
the Visualization Toolkit and a few additional C++ libraries into Tcl. In the
future it might be worth considering using our approach to parse the C++
header files and then generate CDL files for Otcl. This would remove one of
the primary obstacles to our use of Otcl.

Listing 3 shows a simple example of using the C++ Visualization Toolkit from
within Tcl. More information on The Visualization Toolkit can be found at
https://www.cs.rpi.edu/~martink. While our wrapper generator isn't suitable
for many C++ class libraries, you can obtain a copy of it by emailing a
request to martink@crd.ge.com.

Listing 3. A simple Tcl script using the wrapped Visualization Toolkit.

# tcl code to draw a cube
# create a few instances of vtk classes
vtkRenderMaster renMaster; 
vtkCubeSource cubeSrc; 
vtkPolyMapper cubeMap; 
vtkActor cube1;

# create the rendering window and renderer 
set renWin [renMaster MakeRenderWindow];
set ren1   [renWin MakeRenderer];

# connect the pieces and draw the result 
cubeMap SetInput [cubeSrc GetOutput];
cube1 SetMapper cubeMap;
$ren1 AddActor cube1; 
$renWin Render;

I would like to acknowledge Bill Lorensen and Will Schroeder for their
suggestions, encouragement and help in this work.

References

[1] P. Bogdanovich. "Objective-Tcl: An Object Oriented Tcl Environment"
Proceedings of the Tcl/Tk Workshop, Toronto, Ontario, Canada, July 6-8, 1995.

[2] W. E. Lorensen, B. Yamrom. "Object Oriented Computer Animation."
Proceedings of IEEE NAECON, 2:588-595, Dayton Ohio, May 1989.

[3] M. J. McLennan. "[incr Tcl]: Object-Oriented Programming in Tcl"
Proceedings of the Tcl/Tk Workshop, University of California at Berkeley,
June 10-11, 1993.

[4] M. J. McLennan. "The New [incr Tcl]: Objects, Mega-Widgets, Namespaces
and More" Proceedings of the Tcl/Tk Workshop, Toronto, Ontario, Canada, July
6-8, 1995.

[5] W. Schroeder, K. Martin, B. Lorensen. The Visualization Toolkit: An
Object Oriented Approach to 3D Graphics. Prentice-Hall, Englewood Cliffs, NJ,
1996.

[6] D. Sheehan. "Interpreted C++, Object Oriented Tcl, What next?"
Proceedings of the Tcl/Tk Workshop, Toronto, Ontario, Canada, July 6-8, 1995.

[7] D. Wetherall, C. J. Lindblad. "Extending Tcl for Dynamic Object-Oriented
Programming" Proceedings of the Tcl/Tk Workshop, Toronto, Ontario, Canada,
July 6-8, 1995.