Sharing Between Translation Units in C++ Program Databases


Samuel C. Kendall                       Glenn Allin
Sun Microsystems Laboratories, Inc.     CenterLine Software, Inc.
2 Elizabeth Drive                       10 Fawcett Street
Chelmsford, MA 01824                    Cambridge, MA 02138
sam.kendall@east.sun.com                glenn@centerline.com

Abstract

A C++ program database represents information about C++ code, typically to enable program browsing or debugging. Such databases can grow very large. The growth is fundamentally due to the translation unit (TU) program structure C++ inherited from C: a naively designed database will consist largely of representations of redundant or unused code from header files. This paper measures the effect of some techniques for shrinking this naively designed database: the elision of unused entities from a TU, and the sharing or linking (generically, the combination) of redundant entities across TUs. We also measure the overhead imposed by the segregation of class types with external linkage from those with internal linkage. We define and measure these techniques for our own database, which was designed with very specific requirements. We also discuss techniques and organizations used in other program databases to save space: sharing at header file granularity; ruthless simplification of the database; and lazy loading of data into the database. Finally, we note the potential problems associated with independently implemented translators feeding into the same database.
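The effect of elision and combination on database size can be illustrated with a toy model. The sketch below is not the database described in this paper. It assumes, hypothetically, that each entity is identified by a string key, that two entities with the same key are identical and may be shared, and that each TU records which of its declared entities it actually uses. The names TranslationUnit, naive_size, elided_size, and combined_size are invented for illustration. A real database must also verify that entities are truly equivalent before combining them, and must keep entities with internal linkage separate.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a translation unit (TU): the entities it declares
// (mostly pulled in from header files) and the subset its own code uses.
struct TranslationUnit {
    std::string name;
    std::vector<std::string> declared;  // entity keys, e.g. "class A"
    std::set<std::string> used;         // keys actually referenced by the TU
};

// Naive database: one record per declared entity per TU.
std::size_t naive_size(const std::vector<TranslationUnit>& tus) {
    std::size_t n = 0;
    for (const auto& tu : tus) n += tu.declared.size();
    return n;
}

// Elision only: drop entities that a TU declares but never uses.
std::size_t elided_size(const std::vector<TranslationUnit>& tus) {
    std::size_t n = 0;
    for (const auto& tu : tus)
        for (const auto& key : tu.declared)
            if (tu.used.count(key) != 0) ++n;
    return n;
}

// Elision plus combination: each surviving entity is stored once and
// shared by every TU that uses it (assumes equal keys mean equal entities).
std::size_t combined_size(const std::vector<TranslationUnit>& tus) {
    std::set<std::string> shared;
    for (const auto& tu : tus)
        for (const auto& key : tu.declared)
            if (tu.used.count(key) != 0) shared.insert(key);
    return shared.size();
}
```

For two TUs that include the same header declaring three classes, with each TU using two of them, the naive count is six records, elision leaves four, and combination leaves three.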


