Expert: Ralph McArdell - 5/20/2008
Question: There are a lot of disadvantages to macros, so why do we use them?
Answer: Indeed. In fact you should not use macros in C++ unless you have no other choice, as C++ has features that explicitly help reduce our need to resort to nasty macros.
As I mentioned in a previous answer for someone else, Dr. Bjarne Stroustrup lists the following C++ alternatives to uses of preprocessor #define macros in chapter 18, at the end of his book "The Design and Evolution of C++":
"
- const for constants
- inline for open subroutines
- template for functions parameterised by type
- template for parameterised types
- namespaces for more general naming
"
However there are a few places where we cannot do without macros. One is for conditional compilation of code. A very idiomatic example of this is with header file guards that prevent a header file being included more than once per translation unit (a translation unit is roughly what the compiler proper compiles after preprocessing):
#ifndef SOME_APPROPIATELY_LONG_AND_UNIQUE_MACRO_NAME
# define SOME_APPROPIATELY_LONG_AND_UNIQUE_MACRO_NAME
// contents of header file
#endif // SOME_APPROPIATELY_LONG_AND_UNIQUE_MACRO_NAME
The first time the header file is included the header guard macro (named SOME_APPROPIATELY_LONG_AND_UNIQUE_MACRO_NAME in my example) is not defined so the #ifndef (if not defined) conditional code inclusion preprocessor directive condition is true and the code between it and the #endif directive is included in the pre-processed source code. In doing so the header include guard macro is defined. Should the header file be included again the #ifndef condition will no longer be true and the body of the header file will not be included (a second or more time) in the pre-processed source code.
Such guards are required because, although you might say you never directly include a header file more than once in your implementation source files, how do you know for sure that a file is not included more than once without checking the files included by the files you do include, the ones they include, and so on? In fact it may be impossible to prevent in the general case. That is, you can avoid direct repeated inclusion but not indirect repeated inclusion.
Some compilers support a special option (pragma) such as once. #pragma is another preprocessor directive and is used to insert compiler specific options in the source code, so we can use:
#pragma once
// contents of header file
Instead of header include guards for headers used with compilers that support such a pragma. However #pragma once is _not_ standard and therefore cannot be relied upon to be supported by all compilers. In fact another compiler could have a subtly different but equivalent facility.
See http://en.wikipedia.org/wiki/Include_guard for more on include guards.
Other such uses are for configuring code for use with multiple compilers and systems. For example the MS Windows API is very different from UN*X and Linux style APIs, which differ in more subtle ways among themselves. Compilers tend to define a set of non-standard macros which can help determine which compiler (make and version) is being used. Other techniques can be used to ascertain what operating system is being used - from explicitly defining one or more macros on the compiler command line to using tools such as the GNU auto tools (autoconf, automake etc.).
One example you might like to look at is the Boost (see http://www.boost.org) config library (http://www.boost.org/doc/libs/1_35_0/libs/config/doc/html/index.html). The Boost C++ libraries provide a very useful set of tools to the C++ programmer - some of these have made it into the C++ library proper with the TR1 library update to the ISO standard C++. However they tend to assume good compliance with standard C++ and push compilers to (and often beyond) their limits. Boost needs to build and execute on various systems and compilers (sometimes requiring compiler workarounds for missing or broken features) and so has an extensive set of configuration macros.
Other uses of macros concern the fact that they are part of the standard - usually inherited by C++ from C. A couple of such are the NDEBUG and assert macros. The C assert macro can be used to check invariants, pre- and post- conditions and the like. In builds in which NDEBUG is _not_ defined the assert macro will generally terminate execution of the application usually with a message indicating where the assertion failed and often what the asserted condition was. Some debuggers can trap assertions and take you straight to the offending line. If NDEBUG is defined the assert does nothing (so never, never, never do real work in an assert).
Others are the __FILE__ and __LINE__ macros provided by the compiler so that source file name and line number information can be injected into error messages. Often these are wrapped up in a function like macro so the caller does not have to remember to use __FILE__ and __LINE__ explicitly. Doing so requires a macro so that the values of __FILE__ and __LINE__ are evaluated at the current file and line and not the value for the single place used for say a (real non-macro C/C++) function in which they were placed.
Other uses are cases where we wish macros to effectively write some code for us but other techniques are not available to do it (note: class and function templates have helped a lot to reduce such cases). One area is where you have a set of related data and they have to be used to create various constructs. Consider a table of strings and an enumeration type that represents indices into the table:
char const * const stringTable[] =
{ "String 0"
, "String 2"
, "String 4"
, "" // End stop
};
enum StringId
{ EString0
, EString2
, EString4
, EStringEnd // End stop
};
These are quite likely to be in different files: the ids in a header for use by everyone, and the strings defined in a single place and accessed by, say, some function. The problem again is one of consistency over time. As strings are added to the table the StringId enum definition has to be updated and kept in sync with the strings. It is quite possible for someone to re-arrange the strings into some sort of more logical order but forget to update the StringId values, or to not quite get the changes consistent across both constructs.
So we can use a macro listing the ids and strings as pairs as parameters to some operation macro:
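// Add new strings above the EStringEnd end stop line.
// (A // comment cannot follow a line continuation backslash,
// and the last line of the macro must not end with one.)
#define STRING_TABLE_DEFINITIONS(op)\
op(EString0, "String0")\
op(EString2, "String2")\
op(EString4, "String4")\
op(EStringEnd, "" )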
We then define the string table like so:
# define STRING_TABLE_ENTRY(ID_, TEXT_) TEXT_,
char const * const stringTable[] =
{
STRING_TABLE_DEFINITIONS(STRING_TABLE_ENTRY)
NULL // end stop
};
# undef STRING_TABLE_ENTRY
# undef STRING_TABLE_DEFINITIONS
Notice that here we pass a macro as a parameter to another macro. In this case this has the effect of applying the STRING_TABLE_ENTRY operation macro to every line in the STRING_TABLE_DEFINITIONS macro, expanding each line to its second parameter (the string literal) followed by a comma. Without the NULL the last generated entry would leave a trailing comma:
// ...
"", // Trailing comma
};
Strictly speaking a trailing comma is permitted at the end of an array initialiser list, so the NULL is not needed here for the syntax. It is the enumerator list of an enum definition where a trailing comma is currently illegal. I add the NULL anyway as an explicit end stop, which also keeps the table in step with the enumeration:
// ...
"",
NULL
};
(the restriction on enumerations is due to change in the next revision of the C++ standard for just such machine generated code cases).
Note that I undefine the STRING_TABLE_ENTRY and STRING_TABLE_DEFINITIONS macros as soon as possible.
The definition of the enumeration of the string ids is similar:
# define STRING_ID_ENUM_ENTRY(ID_, TEXT_) ID_,
enum StringId
{
STRING_TABLE_DEFINITIONS(STRING_ID_ENUM_ENTRY)
EEndId // end stop: one past the end index value (no leading underscore: names such as _EEndId_ are reserved).
};
# undef STRING_ID_ENUM_ENTRY
# undef STRING_TABLE_DEFINITIONS
Except that we are defining an enum and the operation macro, STRING_ID_ENUM_ENTRY, expands each line of the STRING_TABLE_DEFINITIONS macro to be the first parameter passed to op followed by a comma - i.e. the string id name followed by a comma. Here we must have an explicit last value, as the current standard does not allow a trailing comma at the end of an enumerator list, and again I undefine the STRING_ID_ENUM_ENTRY and STRING_TABLE_DEFINITIONS macros as soon as possible.
A significant problem with this approach is that although it is likely that the string table is only defined in one implementation file the string ids will be defined in a header file included wherever access to table strings by id is required - which can be most of an application. Hence the file containing the STRING_TABLE_DEFINITIONS macro - which over time is likely to grow quite large - will be included in many places, possibly slowing build times. It also increases coupling between the module owning the string table logic and the rest of the system. One way around this would be to write a utility that took the STRING_TABLE_DEFINITIONS macro with the (string id, string literal) pairs as input and produced the StringId enum and stringTable definitions as output in two files. These could then be used throughout the rest of the application source code. The utility would be executed as part of a pre-compile build step. It might be worth while changing the format of the (string id, string literal) pairs to make them easier to process at that point.
A plus point is that other op macros can be devised - e.g. to help with unit testing.
For other uses of macros see the Wikipedia article at http://en.wikipedia.org/wiki/C_preprocessor. You should note which of these can be achieved using main C++ language facilities instead of using macros.
For some advanced C++ preprocessor usage you might like to examine the Boost Preprocessor library (http://www.boost.org/doc/libs/1_35_0/libs/preprocessor/doc/index.html).
Once again, C++ has facilities that allow us to minimise the use of preprocessor macros in our code and we should take advantage of these facilities wherever possible in preference to using preprocessor macros.
If you do have to use macros then use a different naming style to any other identifier type and use long names to help prevent name collisions. You will notice that a common style is that preprocessor macro names are the only names that use all UPPER_CASE with language-proper names using lower_case, camelCase or PascalCase naming styles. The names of the macros in this answer are all of this UPPER_CASE style and reasonably long. In fact if they were really in a project of mine then they would probably be longer as I would prefix additional text such as company, project and module names to each macro name.
If you wish to see what happens when these conventions are not followed, and you develop for the MS Windows platform and have the platform SDK installed, try including windows.h and the C++ algorithm header and then using std::min (hint: windows.h defines a min macro...).
This will compile and link:
#include <algorithm>
#include <iostream>
int main()
{
int v( std::min(23, 45) );
std::cout << "Value = " << v << std::endl;
return 0; // for 'broken' MSVC 6.0
}
When run it will produce:
Value = 23
However, this will not even compile:
#include <windows.h> // Including windows.h defines a min macro
#include <algorithm>
#include <iostream>
int main()
{
int v( std::min(23, 45) );
std::cout << "Value = " << v << std::endl;
return 0;
}
Producing the following errors (from MSVC++ 2005):
error C2589: '(' : illegal token on right side of '::'
error C2059: syntax error : '::'
error C2065: 'v' : undeclared identifier
Hope you find this useful.