Using Tcl to Control a Computer-Participative Multimedia Programming Environment

Christopher J. Lindblad
Telemedia Networks and Systems Group
Laboratory for Computer Science
Massachusetts Institute of Technology

Abstract

This paper describes how the VuSystem, a programming environment for the development of computer-participative multimedia applications, is controlled through Tcl scripts. In the VuSystem, networks of in-band media-processing modules are created and controlled by interpreted out-of-band Tcl scripts through object commands and callbacks. Tcl's extensibility, simple type system, efficient interface to C, and introspective capabilities are used by the VuSystem to produce a highly dynamic and capable media processing system.

Introduction

The VuSystem [VuSystem,Lindblad94] is a programming environment for the development of computer-participative multimedia applications. It is designed to run on high performance computer systems not specifically designed for the manipulation of digital video. The system is unique in that it combines the programming techniques of visualization systems and the temporal sensitivity of traditional computer-mediated multimedia systems.

VuSystem code is split into two partitions: one which does traditional out-of-band processing and one which does in-band processing. Out-of-band processing is that processing which performs event handling and other higher-order functions of a program. In-band processing is the processing performed on every video frame and audio fragment. This architecture differs from that of traditional multimedia toolkits that either provide a low level language interface to a library of primitives [Quicktime93,VFW], or provide a high level language system with limited extensibility [Hypercard93].
In the VuSystem, in-band code can be written in a low level language and can be optimized for performance, while out-of-band code can be written in a high level language and optimized for programmability, usability and extensibility.

In the VuSystem, the in-band processing partition is arranged into processing modules which logically pass dynamically-typed data payloads through input and output ports. These in-band modules can be classified by the number of input and output ports they possess. The most common module classifications are sources, with no input ports and one output port; sinks, with one input port and no output ports; and filters, with one input port and one output port. In-band media-processing code is more elaborate and extensible in the VuSystem than in traditional computer-mediated multimedia systems because VuSystem applications perform more analysis of their input media data.

The nature of out-of-band processing is very different from in-band processing. For the out-of-band code, a programming system can be chosen that can handle user interfaces and other event-driven program functions well. When designing the out-of-band partition, programmability is more important than performance.

For maximum ease of application development, the out-of-band partition of the VuSystem is programmed in an interpreted scripting language: the Tool Command Language, or Tcl [Ousterhout94]. Application code written in Tcl is responsible for creating and controlling the network of in-band media-processing modules, and controlling the graphical user-interface of the application. In-band modules are manipulated with object commands, and in-band events are handled with callbacks.
The VuSystem is implemented on Unix workstations as a shell-like program that interprets an extended version of Tcl. In-band modules are implemented as C++ classes and are linked into the shell. Simple applications that use the default set of in-band modules are written as Tcl scripts. More complicated applications require linking in additional in-band modules to the default set.

Several computer-participative multimedia applications have been developed with the VuSystem [VsApps]. The Room Monitor and The Sports Highlight Browser are two representative examples. The Room Monitor processes video from a stationary camera to determine if a room is occupied or empty, and records only when activity is detected above some threshold, producing a series of video clips that summarize the activity in the room. The Sports Highlight Browser provides access to a sporting news telecast that has been automatically segmented into a set of video clips, each of which represents highlights of a particular sporting event.

VuSystem programs have a media-flow architecture: code that directly processes temporally sensitive data is divided into processing modules arranged in data processing pipelines. This architecture is similar to that of some visualization systems [Khoros,AVS], but is unique in that all data is held in dynamically-typed time-stamped payloads, and programs can be reconfigured while they run. Timestamps allow for media synchronization. Dynamic typing and reconfiguration allow programs to change their behavior based on the data being fed to them.

The Tool Command Language

The Tool Command Language is an excellent programming language for assembling modules into flexible applications. Tcl is designed as a simple but extensible command language. Its syntax is concise enough that simple Tcl commands can just be typed in, but it is programmable and powerful enough that most of the control logic of a large application can be written in it.
It has a simple and efficient interpreter, and a simple interface to C. Tcl syntax is similar to that of Unix shells, but it has additional Lisp-like constructs: Tcl uses curly braces to group elements, square brackets to invoke command substitution, and dollar signs to invoke variable substitution.

Tcl's Simple Type System

In Tcl, all commands and values are strings. There is no other data type. Tcl has no native representation of numbers or lists: all data is in the form of character strings, and even Tcl commands themselves are strings.

Since all data in the Tcl interpreter are strings, the embedding interface is simplified. It is easy to pass data between the Tcl interpreter and C code in an application. The C code need only be able to convert internal objects to and from strings. No library of converters between multiple representations is required.

Data types that can be easily represented in string form are quite natural to use within Tcl. For example, numbers can be easily converted to and from strings using standard mechanisms, and lists can be easily represented as strings, using curly braces for grouping. Data types too complex to be efficiently converted to and from strings can be represented in Tcl with handles or object commands.

Representing Complex Objects in Tcl

Some objects of data types too complex to be efficiently converted to and from strings can be represented with string handles. Primitive commands that manipulate these objects use standard methods to convert string handles to objects, and objects to string handles. Open files are a good example of objects represented by string handles. The standard Tcl library provides commands to open, close, read and write files. These primitives use file handles, short strings that can be converted to and from file descriptors.

The most powerful way to represent objects too complex to be efficiently converted to and from strings is through object commands.
In this approach, for each object of a complicated data type, a unique Tcl command is registered in the interpreter. Operations on an object are performed by invoking its Tcl command, with the first argument to the command specifying the operation, and the rest of the arguments specifying the arguments to the operation. The Tk graphical user interface toolkit uses object commands to manipulate widgets [Ousterhout94].

Object commands are used by the VuSystem to manipulate media processing modules. Each module is manipulated with its own object command. Each object command has several subcommands that allow the state of its object to be queried and changed. VuSystem object commands are constructed using Object Tcl [Wetherall94], a dynamic object-oriented extension to Tcl that was developed for this purpose.

Manipulating Modules

VuSystem media processing modules are created in Tcl with class commands and manipulated with object commands. A class command is defined for each type of module that can be created, and an object command exists for each module created. For example, the VsWindowSink Tcl command creates a VsWindowSink module, and installs a new command in the Tcl interpreter to control the module. The VsWindowSink Tcl command takes as its first argument the name of the object command to create. The rest of the arguments to the class command are parameters for the new module.

Module Types

VuSystem modules are best categorized by how many input and output ports they have. A module is either a source, a sink, a filter, or some other module.

Modules with no input ports and one output port are called sources because they appear to the VuSystem to source data. Sources typically interface to media capture devices or media storage systems. Audio sources interface to audio capture hardware. Video sources interface to video capture hardware. File sources interface to files. More exotic sources exist as well.
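Before turning to the other module types, the class-command and object-command pattern described under Manipulating Modules can be illustrated with a source-like object written in plain Tcl. This is only a sketch under stated assumptions: DemoSource, demoDispatch, and demoState are hypothetical names, real VuSystem modules are C++ objects whose commands are built with Object Tcl, and the convention that a subcommand with no argument queries a value is an assumption made for the illustration.

```tcl
# Hypothetical class command (not part of the VuSystem): creates an
# object command called $name and stores its options in a global array.
proc DemoSource {name args} {
    global demoState
    set demoState($name,pathname) ""
    # The remaining arguments are -option value pairs for the new object.
    foreach {opt val} $args {
        set demoState($name,[string range $opt 1 end]) $val
    }
    # Install the object command; its first argument is the subcommand.
    proc $name {sub args} "demoDispatch [list $name] \$sub \$args"
    return $name
}

# Shared implementation of the subcommands for every DemoSource object.
proc demoDispatch {name sub arglist} {
    global demoState
    switch -- $sub {
        pathname {
            # Assumed convention: one argument sets, no argument queries.
            if {[llength $arglist] == 1} {
                set demoState($name,pathname) [lindex $arglist 0]
            }
            return $demoState($name,pathname)
        }
        destroy {
            # Remove the object's state and its command.
            array unset demoState $name,*
            rename $name ""
            return
        }
        default {
            error "unknown subcommand \"$sub\""
        }
    }
}

DemoSource vs.source -pathname "first.uv"
puts [vs.source pathname]          ;# prints first.uv
vs.source pathname "second.uv"
puts [vs.source pathname]          ;# prints second.uv
```

The class command takes the object name first and parameters after it, and every later operation goes through the object command with the subcommand as its first argument, which is the calling convention the text describes.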
Modules with one input port and no output ports are called sinks because they appear to the VuSystem to sink data. Sinks typically interface to media playback devices or to media storage systems. Audio sinks interface to audio playback hardware. Video sinks interface to video playback hardware. File sinks interface to files.

Modules with one input port and one output port are called filters because they are typically used to perform signal processing operations on the data flowing through them. Compression filters compress or decompress video frames. Pixel format conversion filters convert the format of video frames. Descriptor filters perform operations on the descriptors of payloads. Visual processing filters perform various functions on video data.

Modules with more than one input or output port provide mechanisms for splitting and merging payload sequences. Some modules with one input port and many output ports split a single timestamped sequence of payloads into multiple sequences. Some modules with many input ports and one output port merge multiple sequences into a single sequence. Others combine payload sequences from two input ports into one output payload sequence, operating on the contents of the data.

Communication Between In-Band and Out-Of-Band

In the VuSystem, out-of-band Tcl scripts and in-band C++ modules communicate through object commands and callbacks:

Out-of-band code is able to create and destroy in-band modules, query the state of in-band modules, and give commands to in-band modules, all through special Tcl object commands defined for each in-band module and port.

In-band media-processing code signals out-of-band Tcl code through Tcl callbacks whenever an appropriate in-band event occurs.

Object commands are always completed in the in-band partition synchronously with the out-of-band requester: object commands execute immediately and completely when called from out-of-band scripts.
In contrast, because of the time-critical nature of in-band code, it is unacceptable for in-band code to wait for a response from out-of-band code. Tcl callbacks are executed in the out-of-band partition asynchronously with the in-band partition: callbacks are only queued for execution when invoked from in-band code. Later, the VuSystem scheduler actually executes them. Since out-of-band callbacks do not execute immediately when they have been signalled by in-band code, they are only used to signal events to the out-of-band code, and cannot return values to their in-band signallers. Any in-band changes made by an out-of-band callback are performed through object commands.

Object Subcommands

Each module object command has a set of subcommands that vary according to the type of the module. The internal state of the module can be queried and changed with some of these subcommands. For example, the VsVidboardSource module has a port subcommand that is used to control from which input port it captures video.

Subcommands are implemented as normal Tcl command procedures, whose client data argument by default is a pointer to the associated module. Subcommands are declared friend procedures to the module class, so they may manipulate private members of a module.

Callbacks

In-band modules process continuous sequences of media data, while out-of-band Tcl control processing deals with events. Sometimes, out-of-band Tcl code in an application should be executed when an in-band event occurs. In this case, an in-band module calls a Tcl callback.

The VuSystem provides a facility for each module to have a callback, which can be called whenever a specific event occurs during in-band processing. For example, the VsFileSource module calls its callback when it reaches end-of-file.

Callbacks are defined in Tcl. Typically the Tcl application programmer uses the name of a Tcl procedure as the callback command. The Tcl procedure looks at its arguments to determine what event has occurred.
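The queue-now, run-later behavior of callbacks can be modelled in plain Tcl. The sketch below is not VuSystem code: signalEvent and onEvent are hypothetical names, and Tcl's standard `after idle` and `update` commands stand in for the VuSystem's own callback queue and scheduler, which the paper does not show.

```tcl
# Hypothetical model of callback delivery; not VuSystem code.
set log {}

# Stand-in for an in-band module signalling an event: the callback
# command and its keyword arguments are queued, not run.
proc signalEvent {callback args} {
    after idle [concat $callback $args]
}

# Out-of-band callback: it receives the keyword arguments and can
# inspect them to identify the event.
proc onEvent {args} {
    global log
    lappend log "callback saw: $args"
}

signalEvent onEvent -sourceEnd 1
lappend log "in-band code continues"

# Stand-in for the scheduler: run everything that was queued.
update

puts [join $log "\n"]
```

Run under tclsh, this prints "in-band code continues" first and "callback saw: -sourceEnd 1" second, showing that the signaller never waits for the callback and therefore cannot receive a return value from it.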
Tcl callback commands are installed with the callback subcommand. Each module can have only one callback installed at a time. If a module can signal more than one type of event, it supplies keyword arguments to the callback command, so the command can determine which event occurred.

Example

    proc sourceCallback {args} {
        # Extract the -sourceEnd keyword argument, defaulting to 0.
        set sourceEnd [keyarg -sourceEnd $args 0]
        if {$sourceEnd} {
            # Switch to the second file and clear the callback.
            vs.source pathname "second.uv"
            vs.source callback ""
        }
    }

    SimpleFileSource vs.source \
        -pathname "first.uv" \
        -callback "sourceCallback"

The above code shows how a Tcl application programmer might make use of a simple file source callback that indicates end-of-file. This example code automatically switches the file source from the file first.uv to the file second.uv when end-of-file is encountered on the first file.

The sourceCallback procedure takes a keyword argument list in its args parameter. It extracts any -sourceEnd keyword parameter with the keyarg command, defaulting to 0. If sourceEnd is nonzero, sourceCallback changes the file for the source module using the pathname subcommand for the module. This causes the source module to start on the file second.uv. The sourceCallback procedure also clears the callback for the source module using the callback subcommand for the module, so that when the end of second.uv is signalled, it does not run again.

After defining the sourceCallback procedure, a SimpleFileSource named vs.source is created, with its input file set to first.uv and its callback set to sourceCallback. When started, vs.source will read from first.uv and evaluate the Tcl command string "sourceCallback -sourceEnd 1" when it encounters end-of-file.

The VuSystem Application Shell

The VuSystem is implemented as a Unix application shell: it is a program that interprets an extended version of Tcl. Linked into the program are all standard in-band modules, implemented as C++ classes. Tcl scripts implement simple applications that use the default set of in-band modules.
By linking additional in-band modules into the application shell, more complicated applications can be constructed.

The application shell defines the interface between the primitive module and command developer and the application script developer. At the primitive level, the module developer creates new primitive VuSystem modules and primitive Tcl commands and links them into the VuSystem application shell. At the application level, the developer writes Tcl code that runs in the application shell.

Programming the Graphical User Interface

I implemented a Tcl interface to the X Window System Toolkit [xt] and the Athena widget set for the graphical user-interface code. At the start of VuSystem development, I chose to use Xt and the Athena widget set over the Tk widget set provided with the Tcl distribution, because at that time the Xt intrinsics and Athena widget set were more robust and complete, and also had built-in features for scheduling in large applications. Through the scheduling interface provided by the Xt intrinsics, I provided scheduling to the in-band modules. Today, the VuSystem could be changed to use Tk, since these capabilities are now provided by Tk.

The TclXt and TclXaw components of the VuSystem [Lindblad94] provide Tcl programming interfaces to the standard X Window System Xt and Xaw libraries. These components enable the Tcl programmer to construct graphical user interfaces based on the Xt toolkit and the Athena widget set.

TclXt and TclXaw use object commands to manipulate X displays, application contexts, events, widgets, and widget classes. Widget resources and other object state can be manipulated through subcommands to these object commands. They provide an interface to the Athena widget set that is similar to Tk, as follows:

Widget instances are created by invoking a WidgetClass command. For example, to create a button, one uses the Command Tcl command, which creates a Command widget.
Initial values for widget resources are provided as keyword arguments to the class command. The standard string conversion facilities provided by the Xt intrinsics convert the resource values from strings.

Resources of existing widgets can be queried and changed through widget subcommands. Just as for the initial values of these resources, the standard string conversion facilities provided by the Xt intrinsics convert the resource values to and from strings.

Callbacks and translations can be specified as Tcl commands. These commands are executed whenever the given input event occurs.

TclXt and TclXaw provide a powerful and complete interface to the Athena widget set. They make the entire interface to the Xt and Xaw libraries accessible to the Tcl programmer. With TclXt and TclXaw, there is no library interface available to the C programmer that is not also available to the Tcl programmer.

Example Application Script

The following is a simple example application built out of a video source module, a video sink module, and a filter module. It implements a video version of a 16-square puzzle. It takes input from a camera or other video source, scrambles it, and then presents the output in a window.

    VidboardSource m.source
    Puzzle m.filter \
        -input "bind m.source.output"
    WindowSink m.sink \
        -window w.screen \
        -input "bind m.filter.output"
    m start

The Tcl script fragment that configures the modules and starts them running is shown above. This fragment is not complete, but captures the essence of the larger script.

This script first creates an instance of a VidboardSource module and names it m.source. Then the script creates an instance of the Puzzle module and names it m.filter. It also instructs this module instance to connect its input port to the output port of the m.source module instance. The output port of m.source was automatically named m.source.output.
Next, the script creates a WindowSink module, specifying with the -window option that the widget named w.screen is the window on the screen to use. The input port of the WindowSink module is also specified to be connected to the output port of the Puzzle module. Finally the parent module instance named m, created before this script fragment was run, is given the start command, causing all its child module instances, those with names of the form m.whatever, to start.

In addition to this script fragment, the puzzle application includes similar code to construct its user interface, and code to cause the Puzzle module instance to reconfigure itself whenever the user moves a block in the puzzle.

Extensions to the VuSystem

A few projects are under way to extend the scope of the VuSystem. Work in progress includes a visual programming system for media computation and a distributed programming system for media applications.

Interactive Programming

A visual programming system for application users is nearing completion. Users interact with the system through a flow graph representation of the running program to control its media processing component. A "flow graph" perspective emphasizes the computation that occurs, in contrast to a "hypermedia" perspective, which may view the media in terms of a database to be navigated. Flow graph, or dataflow, representations have been used with success in prior visual languages [AVS].

The visual environment is suited to tasks such as customization, rapid prototyping and experimentation, as well as more general program development. It provides a programming ability (rather than a limited set of configuration options) to users, allowing them to re-program previously developed applications. Because the environment is embedded in a toolkit, consistent user programming facilities are available in all derived applications.