C++/Is there such thing as 'virtual class' or 'abstract function' ?

Question
Hi,

Is there such thing as 'virtual class' or 'abstract function' ?

Thanks,  

Answer
Hmm, out there in the universe in general there quite possibly are such concepts.

However, with specific regard to C++, the category in which you are asking this question, no: neither 'virtual class' nor 'abstract function' is a C++ term as such, although see the bit about virtual base classes below!

Classes can be abstract (and therefore are also base classes, see below). Member functions of classes can be marked virtual.

A class can have member functions that are virtual, which means they can be overridden and redefined by sub-classes; this is what allows run time polymorphism. Dr. Bjarne Stroustrup implies in "The Design and Evolution of C++" that the keyword "virtual" was borrowed from a language called Simula and that virtual is the 'Simula and C++ term for "may be redefined later in a class derived from this one"'.

Now this can be taken to the extreme in that no definition may be provided in the class that initially declares a virtual function - it is a pure virtual function:

   class A
   {
   public:
       virtual void do_something() = 0; // = 0 => pure virtual function
       virtual ~A() {}
   };

A consequence of doing this is that you cannot create objects of class A, as doing so and calling do_something on the object would result in a call to a non-existent function. You have to derive from A and override do_something, providing a function definition (i.e. a function implementation) for it. Because you cannot create objects of class A, class A is known as an abstract class. In fact, because it has to be used as a base class (that is, other classes have to be derived from it for it to be useful), the proper term is abstract base class (or ABC).
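As a minimal sketch of the point above, the following shows an abstract base class and a class derived from it. The names Shape, Square and describe are hypothetical and used only for illustration, and the static_assert with std::is_abstract assumes a C++11 or later compiler.

```cpp
#include <string>
#include <type_traits>

// Hypothetical abstract base class: it declares a pure virtual function.
class Shape
{
public:
    virtual std::string name() const = 0;   // = 0 => pure virtual function
    virtual ~Shape() {}
};

// A derived class becomes concrete by overriding every pure virtual function.
class Square : public Shape
{
public:
    virtual std::string name() const { return "Square"; }
};

// Accepts any class derived from Shape; the call to name() is made dynamically.
std::string describe(const Shape & shape)
{
    return "This is a " + shape.name();
}

// The compiler knows which classes are abstract (C++11 and later).
static_assert(std::is_abstract<Shape>::value, "Shape is abstract");
static_assert(!std::is_abstract<Square>::value, "Square is concrete");

// Shape s;   // would not compile: cannot create an object of an abstract class
```

Passing a Square to describe works because a Square is a Shape; trying to declare a plain Shape object is rejected at compile time.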

The above is not quite correct. In fact you can provide a definition for a pure virtual function. However, this definition can only be called directly using compile time (i.e. static) function call despatch, by qualifying the call with the class name (for example A::do_something() from within a derived class), and not through the run time polymorphic dynamic despatch mechanism. Whenever you call a pure virtual function through a pointer or reference to an ABC (abstract base class) like class A, the compiler will usually have to issue a dynamic call using the dynamic call despatch mechanism. If the function has not been overridden for the object at that moment (which typically only happens while the A part of the object is being constructed or destroyed), that mechanism still has a pure (i.e. non-existent) entry for the function and Bang! Your program would crash, or possibly terminate with some message about a pure virtual function being called. For this reason, even if you do provide a function definition for a pure virtual member function, the class is still abstract and you still cannot create objects of the class, or of any class derived from it that has not overridden all of the pure virtual functions.
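A small sketch of a pure virtual function that nevertheless has a definition may help here. The class names Logger and FileLogger and the function prefix_of are hypothetical, chosen only for this illustration.

```cpp
#include <string>

// Hypothetical abstract base class whose pure virtual function has a definition.
class Logger
{
public:
    virtual std::string prefix() const = 0;   // still pure: Logger stays abstract
    virtual ~Logger() {}
};

// The definition has to be written outside the class body.
std::string Logger::prefix() const
{
    return "[log]";
}

class FileLogger : public Logger
{
public:
    virtual std::string prefix() const
    {
        // The qualified name forces a static (compile time) call to the
        // base class definition, bypassing dynamic despatch.
        return Logger::prefix() + "[file]";
    }
};

// The call here goes through the dynamic despatch mechanism.
std::string prefix_of(const Logger & logger)
{
    return logger.prefix();
}
```

Calling prefix_of with a FileLogger reaches FileLogger::prefix dynamically, which in turn reaches the Logger definition statically. Logger itself still cannot be instantiated.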

OK, so you are probably a little confused after that, so I have provided some more information on the calling mechanisms used by C++ below. It is not really directly relevant to your question so only read it if you are interested, and if you do then please ask followup questions if you require clarification as it is not an easy thing to describe (so sorry if it is not 100% clear!).

Before doing so I shall mention briefly that a class used as a base class to some derived class can be specified as a virtual base class. When this occurs, a further class deriving from two base classes that both derive from a common virtual base class gets only one copy of the common virtual base class:

  class Base {};
  class Middle1 : public Base {};
  class Middle2 : public Base {};
  class Derived : public Middle2, public Middle1 {};

The above class hierarchy does not use virtual base classes, and Derived contains two Base sub-objects, one from Middle1 and the other from Middle2. Often this is not at all what is required. What is required is that Derived contain one Base sub-object, one Middle1 sub-object and one Middle2 sub-object. This is where virtual base classes come in:

  class Base {};
  class Middle1 : virtual public Base {};
  class Middle2 : virtual public Base {};
  class Derived : public Middle2, public Middle1 {};

Note that the virtual-ality is only applied for a specific use of the Base type when it is being used as a base class. It does not affect Base itself in other circumstances. For more information see for example Chapter 15 of "The C++ Programming Language 3rd edition" by Bjarne Stroustrup.
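The difference between the two hierarchies above can be observed from code. The following is only a sketch: the Plain-prefixed class names, the value member and the two helper functions are assumptions added for illustration, so that both hierarchies can live in one file.

```cpp
struct Base
{
    int value;
    Base() : value(0) {}
};

// Without virtual base classes: PlainDerived holds two Base sub-objects.
struct PlainMiddle1 : public Base {};
struct PlainMiddle2 : public Base {};
struct PlainDerived : public PlainMiddle2, public PlainMiddle1 {};

// With virtual base classes: Derived holds a single shared Base sub-object.
struct Middle1 : virtual public Base {};
struct Middle2 : virtual public Base {};
struct Derived : public Middle2, public Middle1 {};

// Returns true if writing through one Base leaves the other untouched.
bool plain_bases_are_distinct()
{
    PlainDerived d;
    // The path must be named: converting &d straight to Base* is ambiguous.
    Base * via1 = static_cast<PlainMiddle1 *>(&d);
    Base * via2 = static_cast<PlainMiddle2 *>(&d);
    via1->value = 42;
    return via1 != via2 && via2->value == 0;
}

// Returns true if both paths lead to the very same Base sub-object.
bool virtual_base_is_shared()
{
    Derived d;
    Base * via1 = static_cast<Middle1 *>(&d);
    Base * via2 = static_cast<Middle2 *>(&d);
    Base * direct = &d;   // unambiguous, as there is only one Base
    via1->value = 42;
    return via1 == via2 && via2 == direct && via2->value == 42;
}
```

Both helper functions are expected to return true: the plain hierarchy has two independent Base parts, while the virtual one has a single Base part reached by either route.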

You should also check out the C++ FAQ lite at http://www.parashift.com/c++-faq-lite/, for virtual functions, ABCs and virtual base classes and loads of other common C++ topics.

Now onto some explanation of function calling in C++. Note that I have made simplifications so as to try to get the main points across.

In general in C++, function calls, whether they are to member functions or free functions (i.e. non member functions), are made using a static despatch mechanism. That is, they are wired up at compile time. It works something like this:

A function implementation, or definition if you like, has an address. The compiler compiles the contents of the function and allocates storage for the code it contains.

If you call the function the compiler makes a direct reference to the address of the function in the code at the call site:

     ; func();
     call 0x00ab0cf0

If you ask for an assembler listing from a compiler (optional output controlled with command line options) and are lucky, the listing will use a symbol. Here is a real example from a listing produced by MSVC++ 8.0 (the 2005 edition):

  call   ?FillArray@@YAIPAHI@Z         ; FillArray

You can then look for further references to ?FillArray@@YAIPAHI@Z and find its assembler code definition, plus a load of other low level stuff, much of which is probably for the benefit of the linker.

Which brings up the problem of functions called that are defined in other modules. In these cases a similar call instruction is generated, but the symbol used is marked as an external function and it is up to the linker to fill in the details during linking.

In either case the result is the same - the addresses of the functions called in call instructions are fixed by the time the executable has been created. They are static in that when the program is executed these values will not change.
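One visible consequence of static despatch is that a non-virtual member function is chosen from the declared type of the reference, not from the type of the object it refers to. The sketch below contrasts this with a virtual function; the names Animal, Dog, kind_of and sound_of are hypothetical.

```cpp
#include <string>

struct Animal
{
    std::string kind() const { return "Animal"; }         // non-virtual
    virtual std::string sound() const { return "..."; }   // virtual
    virtual ~Animal() {}
};

struct Dog : Animal
{
    std::string kind() const { return "Dog"; }            // hides Animal::kind, does not override it
    virtual std::string sound() const { return "Woof"; }  // overrides Animal::sound
};

// Static despatch: the call is wired to Animal::kind at compile time.
std::string kind_of(const Animal & a)
{
    return a.kind();
}

// Dynamic despatch: the function called depends on the object passed at run time.
std::string sound_of(const Animal & a)
{
    return a.sound();
}
```

Given a Dog object, kind_of returns "Animal" because the call was fixed when kind_of was compiled, whereas sound_of returns "Woof".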

So what happens when you use the dynamic, runtime virtual despatch mechanism by using virtual member functions? Well, basically the calls are made indirectly. This is normally (that is, practically always) achieved by storing pointers to virtual member functions in a table, called the vtable (v for virtual), of which there is one per class (note: _not_ per object) that requires one. A class requires a vtable if it has at least one virtual member function. However, each object of such a class does require access to the class's vtable, and so a pointer to it is placed as a hidden data member in each object.

The compiler builds these tables, one for each class: where a derived class overrides a virtual function, its vtable holds a pointer to the derived definition in place of the base class one. What changes during construction is the hidden vtable pointer in the object. Base class parts of an object are initialised before derived class parts, so while the base class constructor runs the object's vtable pointer typically refers to the base class's vtable, with its pointers to the base class virtual member function definitions. Later on the constructor for the derived class part resets the pointer to refer to the derived class's vtable, with pointers to the derived class definitions of any overridden functions. Note that this is why you cannot call virtual functions during construction and have them call the derived implementation - those parts have not been initialised yet, so the calls are still despatched to the base class definitions!
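The effect on virtual calls made during construction can be seen with a short sketch. The names Widget, Button and seen_during_construction are hypothetical, introduced only to record which definition the base class constructor reached.

```cpp
#include <string>

struct Widget
{
    std::string seen_during_construction;

    Widget()
    {
        // A virtual call made while the Widget part is being constructed:
        // the Button part does not exist yet, so Widget::name is called.
        seen_during_construction = name();
    }

    virtual std::string name() const { return "Widget"; }
    virtual ~Widget() {}
};

struct Button : Widget
{
    virtual std::string name() const { return "Button"; }
};
```

After constructing a Button, its seen_during_construction member holds "Widget", while calling name() on the finished object returns "Button".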

Now you may be able to guess why pure virtual functions are initialised using =0. Yep, the pointer to such functions is notionally a null pointer (or zero), and so a null pointer is placed into the vtable for pure virtual functions and trying to make a call to a function through a null pointer is not a good idea! In fact some compilers place a pointer to a piece of code that causes a pure virtual function call error to be raised instead of a real null pointer.

Here is an example:

First the C++ of a simple virtual function pair of classes:

   #include <iostream>   // for std::cout, used in Derived::vfunction

   struct ABC
   {
       virtual void vfunction() = 0;
       virtual ~ABC() {}
   };

   struct Derived : ABC
   {
       virtual void vfunction();
   };

   void Derived::vfunction()
   {
       std::cout << "Derived::vfunction() called\n";
   }

Next a function that calls vfunction dynamically on a reference to ABC:

   void Call_VFunction( ABC & object_ref )
   {
       object_ref.vfunction();   // call through the reference parameter
   }

And finally some code that calls this function:

   int main()
   {
       Derived d;        // Must use a Derived as ABC is abstract
       Call_VFunction(d);
   }

Now let us look at the dynamic call to Derived::vfunction through the object_ref ABC reference argument to Call_VFunction.

Here is the simplified assembler:

  mov   eax, _object_ref$[ebp]
  mov   edx, [eax]
  mov   esi, esp
  mov   ecx, _object_ref$[ebp]
  mov   eax, [edx]
  call   eax

Quite complex isn't it?
Some of the lines concern setting up the stack and this pointer for the call:

  mov   esi, esp
  mov   ecx, _object_ref$[ebp]

So we can remove them as they are not central to the discussion. This leaves:

  mov   eax, _object_ref$[ebp]
  mov   edx, [eax]
  mov   eax, [edx]
  call   eax

The sequence goes as follows:

  mov   eax, _object_ref$[ebp]

Move the value of the object_ref parameter into register eax. This is the address of the Derived object d passed from main. Effectively eax holds a pointer to d.

  mov   edx, [eax]

Move into register edx the contents at the address in register eax, which is the first data item in d. This is the pointer to the vtable. Register edx now holds a pointer to the Derived class's vtable.

  mov   eax, [edx]

Move into eax (again!) the value at the address in edx. This is the first entry in the Derived class's vtable. Register eax now holds a pointer to the Derived::vfunction function.

  call   eax

Finally! Call the function at the address in eax.

Now the same piece of code will work with another class derived from ABC, say Derived2:

   struct Derived2 : ABC
   {
       virtual void vfunction();
   };

   void Derived2::vfunction()
   {
       std::cout << "Derived2::vfunction() called\n";
   }

All that happens is the following:

  mov   eax, _object_ref$[ebp]   ; load pointer to Derived2 object
  mov   edx, [eax]               ; load pointer to Derived2 vtable
  mov   eax, [edx]               ; load pointer to Derived2::vfunction
  call   eax                     ; call Derived2::vfunction

The same code can call different functions depending on the exact type of the object it is passed. The code does not know at compile time which function it will call; that is only determined dynamically at run time.

It does of course rely on the vtables for all the related types being laid out in a compatible and consistent manner and, for that matter, that the position of the vtable pointer for objects is maintained consistently.
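The indirection walked through above can be imitated by hand with ordinary function pointers. This is only a sketch of the idea, not what any particular compiler generates: the VTable and Object structs, the make_ functions and call_vfunction are all hypothetical, and the functions return strings rather than printing so that the result is easy to check.

```cpp
#include <string>

struct Object;   // forward declaration for the function pointer type

// One table per "class", holding pointers to its virtual functions.
struct VTable
{
    std::string (*vfunction)(const Object *);
};

// Each object carries a pointer to its class's table as its first member.
struct Object
{
    const VTable * vptr;
};

std::string derived_vfunction(const Object *)  { return "Derived::vfunction() called"; }
std::string derived2_vfunction(const Object *) { return "Derived2::vfunction() called"; }

// The tables are laid out identically, so one call sequence serves both.
const VTable derived_vtable  = { &derived_vfunction };
const VTable derived2_vtable = { &derived2_vfunction };

Object make_derived()  { Object o = { &derived_vtable };  return o; }
Object make_derived2() { Object o = { &derived2_vtable }; return o; }

std::string call_vfunction(const Object & object_ref)
{
    const VTable * table = object_ref.vptr;                   // like: mov edx, [eax]
    std::string (*function)(const Object *) = table->vfunction; // like: mov eax, [edx]
    return function(&object_ref);                             // like: call eax
}
```

The same call_vfunction code reaches a different function for each object because only the table pointer stored in the object differs.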

Hope this has been informative. As I said before please ask follow-ups if you require further clarification.  

Ralph McArdell

Expertise

I am a software developer with more than 15 years C++ experience and over 25 years experience developing a wide variety of applications for Windows NT/2000/XP, UNIX, Linux and other platforms. I can help with basic to advanced C++, C (although I do not write just-C much if at all these days so maybe ask in the C section about purely C matters), software development and many platform specific and system development problems.

Experience

My career started in the mid 1980s working as a batch process operator for the now defunct Inner London Education Authority, working on Prime mini computers. I then moved into the role of Programmer / Analyst, also on the Primes, then into technical support and finally into the micro computing section, using a variety of 16 and 8 bit machines. Following the demise of the ILEA I worked for a small company, now gone, called Hodos. I worked on a part task train simulator using C and the Intel DVI (Digital Video Interactive) - the hardware based predecessor to Indeo. Other projects included a CGI based train simulator (different goals to the first), and various other projects in C and Visual Basic (er, version 1 that is). When Hodos went into receivership I went freelance and finally managed to start working in C++. I initially had contracts working on train simulators (surprise) and multimedia - I worked on many of the Dorling Kindersley CD-ROM titles and wrote the screensaver games for the Wallace and Gromit Cracking Animator CD. My more recent contracts have been more traditionally IT based, working predominately in C++ on MS Windows NT, 2000, XP, Linux and UN*X. These projects have had wide ranging additional skill sets including system analysis and design, databases and SQL in various guises, C#, client server and remoting, cross porting applications between platforms and various client development processes. I have an interest in the development of the C++ core language and libraries and try to keep up with at least some of the papers on the ISO C++ Standard Committee site at http://www.open-std.org/jtc1/sc22/wg21/.

©2016 About.com. All rights reserved.