C++/void POINTER

Question
Hi..
Can you explain void pointers in C++ in detail, with an example?

Answer
In C and C++ a void pointer, of type void *, is a pointer that points to no particular type of object. As such you cannot de-reference a void pointer (i.e. use what it points at), because the compiler has no idea what type of thing it points to. Similarly you cannot perform pointer arithmetic on a void pointer: pointer arithmetic moves a pointer by a whole number of objects, and as a void pointer points to no type of object the compiler cannot know what size to use for each object.

Pointer arithmetic example:

   char chars[2] = { 'a', 'b' };
   char * pChar = chars; // pChar points to 1st char in chars array

   ++pChar;  // Pointer arithmetic: pChar incremented by sizeof(char)

   int ints[2] = { 1, 2 };
   int * pInt = ints; // pInt points to 1st int in ints array

   ++pInt;  // Pointer arithmetic: pInt incremented by sizeof(int)

In the above example, incrementing a pointer to char increments the pointer by the size of a char. On typical modern hardware, in which char is an 8-bit byte and memory is addressed in 8-bit bytes, this increments pChar by 1 machine address. However, incrementing a pointer to int, pInt, changes the value by the size of an int. With a 32-bit compiler in which int is 32 bits, or 4 bytes, in size, this adds 4 to the value of pInt.

Below are values from an execution of the above example code built using Microsoft Visual C++ 2005:

   pChar initial value: 0x0017ff30
   pChar after ++pChar: 0x0017ff31

    pInt initial value: 0x0017ff14
    pInt after ++pInt : 0x0017ff18

Notice that, as described, the value used in pointer arithmetic depends on the type pointed to, specifically the size of that type. Now ask yourself: If we could use pointer arithmetic with pointers to void, what size should we use for void? 0? 1? Infinity? Something else?

In fact if we try it the compiler should give us an error. If for example we try:

   int ints[2] = { 1, 2 };

   void * pVoid = ints;  // Assign pVoid to point int array start

   ++pVoid;     // What size to use?

Building using the same MSVC++ compiler as before gave me:

   error C2036: 'void *' : unknown size

And just for completeness, if we try to de-reference the pointers used above to make use of what they point to, for example to print them out to the console:

   std::cout << *pChar;
   std::cout << *pInt;

Assuming the above statements follow the increments they will display:

   b2

As expected - they have been incremented to point to the next object in memory to which they were initially pointing.

However if we try it with pVoid:

   std::cout << *pVoid;

We again get an error. MSVC++ gives the slightly obscure:

   error C2100: illegal indirection

The message makes more sense once you realise that dereferencing a pointer is an indirect way of referring to something, hence the term indirection.

OK so those are the things you cannot do with a void *, so what good is a void *?

They are generally used to pass a value of any type through some function that does not need to know what that type is. Usually such a function will at some point call another part of your own code (usually another function) and hand back the void * you provided. You can then cast it back to a pointer of the original type; the type it was before being converted and passed around as a pointer to void.

This sort of behaviour is much more common in C code, and in interfaces with C calling conventions, than in C++. This is because C has far fewer abstraction features than C++ - in fact C is often referred to as a portable assembler, and one of the main aims of C++ was (and is) to be a better C.

One of the problems with using void * is that as soon as you convert something to a void * all information about what type of thing the pointer points at is lost. There is absolutely _no_ guarantee that the pointer type the void * is cast back to matches that from which it was created.

Consider a case in which the void * is created in one piece of code (e.g. some source code file) and is converted back to something usable in some other place (e.g. some other source code file). Now consider that originally you pass a pointer to a short int, as this seems all that is required. The receiver of the void * assumes it originally pointed to a short int, and casts it back to a short int *. All is well: sender and receiver are using matching types.

Later on you - or maybe someone else entirely - find that more detail is required - maybe a larger integer is needed, a long int, or maybe a floating point value is required. The first source file is modified to pass a long int pointer or double pointer as a void *. But you forget to, or the other person does not realise they need to, update the other end. The receiver _still_ thinks it has a void * that really points to a short int, and converts the void * as before. Now this may not immediately look too bad in the long int case - large values will probably be truncated, as not all the bytes in the value are used, but smaller values may appear to work, so the problem may go unnoticed for some time. However suppose the passed in value really pointed to a floating point type. The formats of floating point types and integers are totally different, so treating some of the bits of a float like a short int is much more likely to yield rubbish!

Here is a short code example:

   #include <iostream>

   typedef void (*CallBackFuncPtrType)(void* pData);

   void RegisterCallBackFunction
   ( CallBackFuncPtrType fn
    , void * pCallbackData
   )
   {
     fn(pCallbackData);
   }

   void Receiver( void * pData )
   {
       short int * pShortData( reinterpret_cast<short int*>(pData) );

       std::cout << *pShortData << std::endl;
   }

   void SendShortData()
   {
       short int data(12345);
       std::cout << "SendShortData: ";
       RegisterCallBackFunction( Receiver,  &data );
   }

   void SendLongData()
   {
       long int data( 1234567890L );
       std::cout << "SendLongData: ";
       RegisterCallBackFunction( Receiver,  &data );
   }

   void SendDoubleData()
   {
       double data( 1234567890 );
       std::cout << "SendDoubleData: ";
       RegisterCallBackFunction( Receiver,  &data );
   }

   int main()
   {
       SendShortData();
       SendLongData();
       SendDoubleData();
   }

First I define a type alias for the function pointer type for the callback function I will be using. This is a pointer to functions of type: void fn(void*). That is, functions taking a void * pointer parameter and returning nothing. Yes it is nasty, but that is the declaration syntax C++ inherited from C. The typedef gets it out of the way - after that I can use the alias name instead of the whole nasty shooting match, which I do in the signature of the RegisterCallBackFunction function. This takes a pointer to the previously mentioned function type and a void * pointer to some data to pass on to this function. In this simple example no actual registration takes place. Instead the function passed by pointer is called at once, and is passed the void * pCallbackData parameter that was also passed to RegisterCallBackFunction.

RegisterCallBackFunction stands in for the C style API we are trying to use. Real examples are the various C APIs (application programming interfaces) for starting a new thread of execution, in which it is usual to pass a function the new thread should execute and some data to be passed to this function as a void*. Other examples, such as notifications of various other system events, are quite easy to come by - for example the Win32 API (a C API) synchronisation functions include RegisterWaitForSingleObject, which has the following signature:

   BOOL WINAPI RegisterWaitForSingleObject
   (
       PHANDLE phNewWaitObject,
       HANDLE hObject,
       WAITORTIMERCALLBACK Callback,
       PVOID Context,
       ULONG dwMilliseconds,
       ULONG dwFlags
   );

Note the 4th parameter, Context. Its type is PVOID, which is Win32 speak for void*. The description of this parameter is:

   "Context
   [in] Single value that is passed to the callback function."

(See the MSDN library for more information on RegisterWaitForSingleObject and all other Win32 API and other Microsoft development documentation, it can be found online at http://msdn2.microsoft.com/).

And finally the C library itself - check out the bsearch function (which performs binary searches of arrays of data for key values) in <stdlib.h>:

   void * bsearch( const void * key
         , const void * base
         , size_t n
         , size_t size
         , int (*cmp)(const void * keyval, const void * datum)
         );

Here the last parameter (cmp) is a pointer to a function taking two const void * values, pointing to the key value being searched for and the current datum point of the binary chop search. The intention is that cmp casts its parameters to pointers of the actual types being used, compares them, and returns 0 if they are equal, a negative value if the key value is less than the datum value, or a positive value if the datum value is less than the key value. The qsort function (also declared in stdlib.h) uses a similar technique.

In C++ we can instead use the standard library's generic containers and algorithms, which rely on compile-time template techniques to store and process collections of various types (though not different types in the same collection, unless they are collections of pointers to some base type).

OK, so after the definition of the spoof API function we are trying to use comes the callback function itself, which I called Receiver (as it is the receiver of the void * pointer to data). The callback function assumes the void * data it is passed points to a short integer, and casts it back to a pointer to one using the C++ reinterpret_cast. This cast has implementation defined behaviour, but generally leaves the bits of a pointer unchanged and just changes the type of the pointer. It is therefore quite dangerous and should be avoided in main code, and relegated to uses such as this one, where we are forced to lose type information through a (C-style) interface, library, framework etc. (A static_cast can also convert a void * back to a pointer to an object type, and is often preferred for exactly this conversion; either way the compiler cannot check that the target type is the right one.)
Finally we use the supposed short data by printing its value out on the console.

After the callback function are three functions that 'register' the Receiver callback function and pass it some data. These functions have names of the form SendTypeData, where Type is the type of the data that is sent to the callback function. They are prefixed Send as they send data values via a void * pointer to the Receiver callback function.

SendShortData sends a short int value as data to the Receiver callback - as Receiver is expecting.

SendLongData sends a long int value as data to the Receiver callback - which the Receiver function is not expecting.

SendDoubleData sends a double floating point value as data to the Receiver callback - which the Receiver function is not expecting.

Finally there is a main function that calls the three Send.. functions. The results of the program when built, again using Visual C++ 2005, are:

   SendShortData: 12345
   SendLongData: 722
   SendDoubleData: 0

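As you can see, only the (correct) SendShortData data is extracted correctly by the Receiver function. (The 722 is not random: 1234567890 is 0x499602D2, and its low-order 16 bits, 0x02D2, are 722, which is what a little-endian machine with a 16-bit short int reads from the start of the long int.) The other two can be made to work if we use modified versions of the Receiver function as their callback functions, which cast the pData pointer to a pointer to the correct type: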

   void ReceiveLong( void * pData )
   {
       long int * pLongData( reinterpret_cast<long int*>(pData) );

       std::cout << *pLongData << std::endl;
   }

   void ReceiveDouble( void * pData )
   {
       double * pDoubleData( reinterpret_cast<double*>(pData) );

       std::cout << *pDoubleData << std::endl;
   }

   void SendLongData()
   {
       long int data( 1234567890L );
       std::cout << "SendLongData: ";
       RegisterCallBackFunction( ReceiveLong,  &data );
   }

   void SendDoubleData()
   {
       double data( 1234567890 );
       std::cout << "SendDoubleData: ";
       RegisterCallBackFunction( ReceiveDouble,  &data );
   }

The rest of the code is left as before. If we now re-build and execute the code we get:

   SendShortData: 12345
   SendLongData: 1234567890
   SendDoubleData: 1.23457e+009

Which is more reasonable. This version does demonstrate that we can use different callback functions with the same registration (or equivalent) function, each having different requirements for the (context) data they wish passed on to the callback function.

Now I am going to re-iterate that void * is _dangerous_ because it loses type information, and can lead to mistakes and errors which may only show up later in the lifetime of a product (or hopefully during testing). C++ has quite strong type checking for a reason - to help prevent certain types of programmer errors! The use of casts to convert between types usually indicates an area where things are not quite as we would like. They are necessary in real code, but not frequently. Any C++ program design that relies on casting as part of the design is a bad design. Casting is there for those really awkward places - often when interfacing with old code, C code and C interfaces (as above), and in some cases very low level code (e.g. we may wish to treat an object of some type as just a bunch of bytes for transmission over some medium, and then need to convert them back again at the other end).

The most obvious way around having to use void* to pass data around in C++ is to use classes - specifically a simple class hierarchy with a polymorphic base class (i.e. one having at least one virtual member function). In fact such a type could be abstract - i.e. at least one of the virtual member functions is pure [has no (polymorphic) implementation]. Derived classes override and implement the virtual functions. Derived classes can also have state, for use by their virtual member function implementations.

Let us take the previous example as the basis. We define a base class that has one pure virtual member function. This function takes the place of the callback function in the previous example. The RegisterCallBackFunction will now take a reference (or pointer) to an instance of this type. In fact it will be a reference or pointer to an object of a type derived from this base, as we cannot instantiate an abstract base class (the pure virtual function(s) mean that any such object would be incomplete, as some of the operations on the object have no (polymorphic) implementation). This object takes the place of the data previously passed by void * to RegisterCallBackFunction. The base class looks as follows:

   class CallbackBase
   {
   public:
         virtual void Receive() = 0; // pure virtual

         virtual ~CallbackBase(); // class is polymorphic
         // - destructor should be virtual
   };

   CallbackBase::~CallbackBase()
   {
   }

The modified RegisterCallBackFunction takes a reference to a CallbackBase object [including objects of types (publicly) derived from CallbackBase; remember derivation is an is-a association so an object of a class (publicly) derived from CallbackBase is a CallbackBase]. It looks as follows:

   void RegisterCallBackFunction( CallbackBase & cbObj )
   {
       cbObj.Receive();
   }

Notice the lack of pointers to functions and pointers to void. I could instead have chosen to pass the CallbackBase by pointer, in which case RegisterCallBackFunction would minimally look as follows:

   void RegisterCallBackFunction( CallbackBase * cbObj )
   {
       cbObj->Receive();
   }

However, as pointers can be null, a check for cbObj being null really would have to be added (std::invalid_argument is declared in <stdexcept>):

   void RegisterCallBackFunction( CallbackBase * cbObj )
   {
       if ( cbObj )
       {
         cbObj->Receive();
       }
       else
       {
         throw std::invalid_argument
         ("Null CallbackBase pointer argument.");
       }
   }

The updated versions of the Receiver, ReceiveLong and ReceiveDouble callback functions are implemented as overrides of the CallbackBase::Receive pure virtual function in classes derived from CallbackBase. The data that they were passed are now instance data members of these classes, and objects of these classes are constructed from a single argument of the appropriate data type (short, long or double). Here is the implementation of the short int data version:

   class CallbackShortData : public CallbackBase
   {
       short int   iData;

   public:
       explicit CallbackShortData( short int data ) : iData(data) {}

       virtual void Receive();
   };

   void CallbackShortData::Receive()
   {
       std::cout << iData << std::endl;
   }

Notice how simple void CallbackShortData::Receive() is - no reinterpret_cast, no dereferencing of pointers. The data is carried with the object on which we make the callback. What this data is, and how it is initialised and used, is up to the class. Once an object of the class is created all its data should be properly initialised (if the class is written well). When sending such an object through the callback mechanism we pass it as a reference or pointer to the base type, which defines the callback function(s) expected. As we are using polymorphism, when the base object reference (or pointer) is used to call the Receive callback member function, the derived implementation gets called in the context of the object of derived type that was passed in, with all the state of this object available. Hence this mechanism preserves the data type information from the point of registration to the point of call back.

The long and double classes are similar, other than that their iData member types are long and double respectively and they are initialised with values of these types:

   class CallbackLongData : public CallbackBase
   {
       long int   iData;

   public:
       explicit CallbackLongData( long int data ) : iData(data) {}

       virtual void Receive();
   };

   void CallbackLongData::Receive()
   {
       std::cout << iData << std::endl;
   }

   class CallbackDoubleData : public CallbackBase
   {
       double   iData;

   public:
       explicit CallbackDoubleData( double data ) : iData(data) {}

       virtual void Receive();
   };

   void CallbackDoubleData::Receive()
   {
       std::cout << iData << std::endl;
   }

The Send functions now create objects of the CallbackXXX types, initialised with the same values used previously. They pass these objects to the (updated) RegisterCallBackFunction function:

   void SendShortData()
   {
       CallbackShortData data(12345);
       std::cout << "SendShortData: ";
       RegisterCallBackFunction( data );
   }

   void SendLongData()
   {
       CallbackLongData data( 1234567890L );
       std::cout << "SendLongData: ";
       RegisterCallBackFunction( data );
   }

   void SendDoubleData()
   {
       CallbackDoubleData data( 1234567890 );
       std::cout << "SendDoubleData: ";
       RegisterCallBackFunction( data );
   }

The main function remains the same.

Building and executing the C++ version of the code produces:

   SendShortData: 12345
   SendLongData: 1234567890
   SendDoubleData: 1.23457e+009

This matches the output of the previous, corrected, C-style version using void*.

Should anything go wrong in the implementation of the C++ version then it is more likely that some sort of warning would be produced at least (although compiler warnings are specific to compilers and not mandated by the C++ standard, most tend to warn if you cause a value to be converted in a way that data could be lost). If the types concerned were really incompatible then there would be compiler errors. Such warnings and errors will of course generally point you at some code - hopefully code that hints heavily at the problem!

Such mistakes should also be less likely as the code that defines the types of the context state data members and the types used in the initialisation of those members passed to constructors would be close to each other in the same class definition, in the same file.

So in brief

- void * allows passing around data of any type by pointer
- it is needed much more in C than in C++
- it is unsafe as it loses type information and can therefore lead to programmer errors that are impossible for the compiler to catch. Catching such errors probably has to be deferred to run time testing, which is much more tedious.
- C++ has other mechanisms that can be used instead more safely
- refrain from using void * in C++ code as much as possible
- if you do have to use it, wrap it up in a safer C++-style interface as soon and as much as possible. Keep the use of void * localised to those places where the code uses the offending C interface!
- avoid using the C library functions such as qsort and bsearch. C++ has a much broader range of container types and algorithms to work on such data collections based on safer static compile time template based generic techniques.

Hope this has given you enough on void pointers.  

Ralph McArdell

Expertise

I am a software developer with more than 15 years C++ experience and over 25 years experience developing a wide variety of applications for Windows NT/2000/XP, UNIX, Linux and other platforms. I can help with basic to advanced C++, C (although I do not write just-C much if at all these days so maybe ask in the C section about purely C matters), software development and many platform specific and system development problems.

Experience

My career started in the mid 1980s working as a batch process operator for the now defunct Inner London Education Authority, working on Prime mini computers. I then moved into the role of Programmer / Analyst, also on the Primes, then into technical support and finally into the micro computing section, using a variety of 16 and 8 bit machines. Following the demise of the ILEA I worked for a small company, now gone, called Hodos. I worked on a part task train simulator using C and the Intel DVI (Digital Video Interactive) - the hardware based predecessor to Indeo. Other projects included a CGI based train simulator (different goals to the first), and various other projects in C and Visual Basic (er, version 1 that is). When Hodos went into receivership I went freelance and finally managed to start working in C++. I initially had contracts working on train simulators (surprise) and multimedia - I worked on many of the Dorling Kindersley CD-ROM titles and wrote the screensaver games for the Wallace and Gromit Cracking Animator CD. My more recent contracts have been more traditionally IT based, working predominately in C++ on MS Windows NT, 2000, XP, Linux and UN*X. These projects have had wide ranging additional skill sets including system analysis and design, databases and SQL in various guises, C#, client server and remoting, cross porting applications between platforms and various client development processes. I have an interest in the development of the C++ core language and libraries and try to keep up with at least some of the papers on the ISO C++ Standard Committee site at http://www.open-std.org/jtc1/sc22/wg21/.

©2016 About.com. All rights reserved.