USENIX - USENIX'97 Conference Summaries

Libcdt: A General and Efficient Container Data Type Library

By Kiem-Phong Vo, AT&T Labs-Research

Summary by Gordon Galligher

Containers for data are used in nearly every application on the planet, and in each instance, the particular storage mechanism is typically rewritten. The authors of Libcdt wanted a more generic container data type that was flexible and portable enough to be used in any number of situations. It had to support multiple data types, as well as dynamically defining those data types. It also needed to support having the same objects stored in different containers.

Prior to Libcdt, there were things such as the tsearch and bsearch algorithms, which were messy with no common interface among them, and they work only on specific types of data. The "Map" and "Set" capabilities of C++ are better, but they do not support a uniform interface or the mixing of container data types, and the performance was not stellar. Research into the C++ Standard Template Library (STL) showed that although it had a common interface, it did not support mixing the use of containers, and it was difficult to change features during runtime. As a result, the authors decided that it was necessary to create their own data type.

The Container Data Type (CDT) Methods and Disciplines Library was born. This library provides a common interface to the programmer for various data types and storage mechanisms; it also supports the ability to dynamically change the storage mechanisms at runtime. This dynamic storage mechanism is very powerful and gives the programmer the ability to optimize various parts of the program whenever needed. The CDT Library is managed via three data types: the dictionary, which stores ubjects; the method, which defines how objects are stored; and a discipline, which is application-specific and determines the structure of the objects for comparison, hashing, allocation, and event reporting.

The tests of the CDT Library outperformed the other routines mentioned by a factor of two to four for data sets that were not ordered/sorted. The CDT Library also outperformed all other routines by a factor of three to six on ordered data sets. Perhaps the most surprising aspect of the tests is that the "ordered set" method of CDT outperformed the hashed method of the routines mentioned previously.

The code for the CDT Library is not publicly available for download, but it can be requested by sending electronic mail to kpv@research.att.com.

Originally published in ;login: Vol. 22, No.2, April 1997.

webster@usenix.org
Last changed: May 28, 1997 pc