by Glen McCluskey
Glen McCluskey is a consultant with 15 years of experience and has focused on programming languages since 1988. He specializes in Java and C++ performance, testing, and technical documentation areas.
This column covers some miscellaneous topics related to using C++ as a better C.
Mixing C++ and C Code
A common issue with programming languages is how to mix code written in one language with code written in another. For example, suppose that you're writing C++ code and wish to call C functions. A common case of this would be to access C functions that manipulate C-style strings, for example strcmp() or strlen(). So as a first try, we might say:
extern size_t strlen(const char*);
and then use the function. This will compile, but will probably give a link error about an unresolved symbol.
The reason for the link error is that a typical C++ compiler will modify the name of a function or object ("mangle" it), for example to include information about the types of the arguments. As an example, a common scheme for mangling the function name strlen(const char*) would result in:
strlen__FPCc
There are two purposes for this mangling. One is to support function overloading. For example, the following two functions cannot both be called "f" in the object file symbol table:
int f(int);
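A minimal sketch of the overloading case, assuming a second overload that takes a double (the second parameter type and the return values are illustrative assumptions). Both functions are named f in the source, so the compiler has to give them distinct names in the object file:

```cpp
// Two overloads that share the source-level name "f".
// The f(double) overload and the return values are assumptions for illustration.
int f(int) { return 1; }      // chosen for int arguments
int f(double) { return 2; }   // chosen for double arguments

// The overload is selected at compile time from the argument type;
// the linker then sees two differently mangled symbols.
int call_with_int() { return f(37); }
int call_with_double() { return f(12.34); }
```

Calling call_with_int() and call_with_double() from a test program shows that each call reaches a different function even though both are spelled f.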
But suppose that overloading was not an issue, and in one compilation unit we have:
extern void f(double);
and we use this function, and its name in the object file is just "f". And suppose that in another compilation unit the definition is found, as:
void f(char*) {}
This will silently do the wrong thing: a double will be passed to a function requiring a char*. Mangling the names of functions eliminates this problem, because a linker error will instead be triggered. This technique goes by the name "type safe linkage."
So to be able to call C functions, we need to disable name mangling. The way of doing this is to say:
extern "C" size_t strlen(const char*);
or:
extern "C" {
This usage is commonly seen in header files that are used both by C and C++ programs. The extern "C" declarations are made conditional on whether the file is being compiled as C++ rather than C, typically by testing the predefined __cplusplus macro. Because name mangling is disabled with a declaration of this type, usage like:
extern "C" {
is illegal (because both functions would have the name "f").
Note that extern "C" declarations do not specify the details of what must be done to allow C++ and C code to be mixed. Name mangling is commonly part of the problem to be solved, but only part. There are other issues with mixing languages that are beyond the scope of this presentation. The whole area of calling conventions, such as the order of argument passing, is a tricky one. For example, even if every C++ compiler used the same mangling scheme for names, this would not necessarily result in object code that could be mixed and matched.
Declaration Statements
In ANSI C (C89), when you write a function, all the declarations of local variables must appear at the top of the function or at the beginning of a block:
void f()
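A minimal sketch of such a function, assuming an int x declared at the top of the function and an int y declared at the start of a while block. The loop condition, the initial values, and the int return type (used here so the result can be checked) are illustrative assumptions:

```cpp
// C-style layout: every declaration sits at the start of its block.
int f()
{
    int x = 0;          // declared at the top of the function
    while (x < 3) {
        int y = 1;      // declared at the beginning of the while block
        x += y;         // y is visible only inside the loop body
    }
    return x;           // x is still accessible here; y is not
}
```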
Each such variable has a lifetime that corresponds to the lifetime of the block it's declared in. So in this example, x is accessible throughout the whole function, and y is accessible inside the while loop. In C++, declarations of this type are not required to appear only at the top of the function or block. They can appear wherever C++ statements are allowed:
class A {
and so on. Such a construction is called a declaration statement. The lifetime of a variable declared in this way is from the point of declaration to the end of the block. A special case is used with for statements:
for (int i = 1; i <= 10; i++)
In this example the scope of i is the for statement. The rule about the scope of such variables has changed fairly recently as part of the ANSI standardization process, so your compiler may have different behavior.
Why are declaration statements useful? One benefit is that introducing variables with shorter lifetimes tends to reduce errors. You've probably encountered very large functions in C or C++ where a single variable declared at the top of the function is used and reused over and over for different purposes. With the C++ feature described here, you can introduce variables only when they're needed.
Character Constants
There are a couple of differences in the way that ANSI C and C++ treat character constants and arrays of characters. One of these has to do with the type of a character constant. For example:
#include <stdio.h>
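A minimal sketch of such a program's core, assuming the character constant 'a' (the constant and the function names are illustrative assumptions). It is written as functions, using only constructs that compile as both C and C++, so the value can be checked as well as printed:

```cpp
#include <stddef.h>
#include <stdio.h>

/* Size of a character constant: sizeof(char), that is 1, when compiled as C++;
   sizeof(int) when the same source is compiled as C. */
size_t char_constant_size(void) { return sizeof('a'); }

/* Print the value, as the program discussed in the text does. */
void print_char_constant_size(void)
{
    printf("%lu\n", (unsigned long)char_constant_size());
}
```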
If this program is compiled as ANSI C, then the value printed will be sizeof(int), typically 2 on PCs and 4 on workstations. If the program is treated as C++, then the printed value will be sizeof(char), defined by the draft ANSI/ISO standard to be 1. So the type of a char constant in C is int, whereas the type in C++ is char. Note that it's possible to have sizeof(char)==sizeof(int) for a given machine architecture, though not very likely. Another difference is illustrated by this example:
#include <stdio.h>
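A minimal sketch of the difference, assuming a five-character literal and a five-element array (both sizes are chosen for illustration). The too-small declaration appears only in a comment so that the block compiles as C++:

```cpp
#include <cstddef>
#include <cstring>

// char buf[5] = "abcde";   // accepted by C, rejected by C++: no room for '\0'

// Omitting the bound lets the compiler size the array to 6:
// five characters plus the terminating '\0'.
char buf[] = "abcde";

std::size_t buf_size() { return sizeof(buf); }          // storage, including '\0'
std::size_t buf_length() { return std::strlen(buf); }   // characters before '\0'
```

The commented-out declaration, with room for only five characters, is the form discussed next.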
A declaration of this kind, where the array has no room for the string literal's terminator, is legal C but invalid C++. The string literal requires a trailing \0 terminator, and there is not enough room in the character array for it. C accepts the declaration and drops the terminator, but you access the resulting array at your own risk. Without the terminating null character, a function like printf() may not work correctly, and the program may not even terminate.
Function-style Casts
In C and C++ (and Java), you can cast one object type to another by usage like:
double d = 12.34;
Casting in this way gets around type system checking. It may introduce problems such as loss of precision, but is useful in some cases. In C++ it's possible to employ a different style of casting using a functional notation:
double d = 12.34;
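A minimal sketch comparing the two notations, assuming the converted value is stored in an int (the target type and the function names are illustrative assumptions):

```cpp
// Old-style cast: (type)expression.
int old_style_cast(double d) { return (int)d; }

// Functional notation: type(expression). Same conversion, different syntax.
int functional_cast(double d) { return int(d); }
```

Both functions truncate the fractional part, so passing 12.34 to either one yields 12.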
This example achieves the same end as the previous one. The types that can be used with this notation are limited: the type must be written as a simple type name. For example, saying:
unsigned long*** p = unsigned long***(0);
is invalid, and would need to be replaced by:
typedef unsigned long*** T;
or by the old style:
unsigned long*** p = (unsigned long***)0;
Casting using functional notation is closely tied in with constructor calls. For example:
class A {
causes an A object local to f() to be created via the default constructor. Then this object is assigned the result of constructing an A object with 37 as its argument. In this example there is both a cast (of sorts) and a constructor call. If we want to split hairs, a perhaps more appropriate technical name for this style of casting is "explicit type conversion."
It is also possible to have usage like:
void f()
If this example used a class type with a default constructor, then the constructor would be called both for the declaration and the assignment. But for a fundamental type, a call like int() results in a zero value of the given type. In other words, i gets the value 0.
The reason for this feature is to support generality when templates are used. There may be a template such as:
template <class T> class A {
and it's desirable that the template work with any sort of type argument.
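A minimal sketch of why this matters for templates, assuming a hypothetical class type Counter and a hypothetical class template Holder (neither name is from the original listings). The expression T() yields zero for a fundamental type and calls the default constructor for a class type, so one template body serves both:

```cpp
// Hypothetical class type with a default constructor and a constructor taking an int.
class Counter {
public:
    Counter() : n(1) {}
    Counter(int v) : n(v) {}
    int value() const { return n; }
private:
    int n;
};

// Hypothetical template: T() initializes the member for any T.
template <class T> class Holder {
public:
    Holder() : item(T()) {}
    T get() const { return item; }
private:
    T item;
};

// int() gives a zero value for a fundamental type.
int zero_int() { return int(); }

// The same T() expression inside the template gives 0 for int ...
int held_int() { return Holder<int>().get(); }

// ... and runs Counter's default constructor for a class type.
int held_counter() { return Holder<Counter>().get().value(); }

// Functional notation with an argument: construct a Counter from 37 and assign it.
int converted_counter() { Counter c; c = Counter(37); return c.value(); }
```

With these definitions, zero_int() and held_int() return 0, held_counter() returns 1, and converted_counter() returns 37.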
First posted: 13th April 1998 efc. Last changed: 13th April 1998 efc.