C++/Dynamic Memory Allocation
I have been reading about dynamic memory allocation and how the new operator is used to allocate memory at the 'runtime'.
But I can't understand how it works, or exactly what it does, for example, I can declare an array in the two following ways:
char* array = new char;
well, in both cases I have to specify how large the array should be in advance, so what good does the dynamic memory management do?
In other words, what confuses me is that when we use the new operator to allocate a fixed block of memory which we also specify how large it should be in advance, so then what is the benefit of using dynamic memory allocation?
I would really appreciate it if you would clarify how dynamic memory allocation works and exatcly what benefits does it provide over the static memory allocation and when should I use dynamic memory allocation?
Sorry if this got too long and thank you for your help.
First we usually just say just "at runtime" not "at the runtime".
The second point is that you have missed the point (<g>). Yes you can say
char * array = new char;
and then, if the creation was successful when you have done with the array you have to destroy the objects and return the memory:
delete  array;
However you can also specify the size by using a variable whose value is only known to the code at runtime:
char * allocate_buffer( size_t buffer_size )
return new char[buffer_size];
which you cannot do using static allocation:
unless buffer_size is a constant:
size_t const buffer_size(1000);
or a pre-processor macro:
#define buffer_size 1000
although you should not use the pre-processor for this sort of thing in C++ as we have constant objects, as shown before. Also if you do have to define pre-processor macros many people think that reserving the use of all upper-case names for them is a good idea to differentiate them from other names in the code:
#define BUFFER_SIZE 1000
The reason for this restriction is that the compiler needs to know the value to ensure it has arranged for the correct amount of storage to be allocated when it compiles your code. In the case of a constant object it has already compiled the definition and as it knows it is constant it can safely assume the value used to initialise the object is the one to use. In the case of a pre-processor macro the pre-processor, which runs before the compiler proper (and is on some systems a separate program), will have already replaced the name with the macro expansion - in my last example this means the compiler would see 1000 rather than BUFFER_SIZE. By the way this is another reason to prefer constants and other full language features rather than pre-processor macros and the like. The pre-processor is dumb - it takes the macros it finds and just replaces them with whatever text the macro-expansion creates. It knows nothing of the context in which these expansions occur - if it creates erroneous or nonsensical code then so be it. However the compiler can only report on the results of the expansion so the names of the actual macros involved get lost. There is a similar problem when using a debugger.
Now consider what happens if you need to assign a larger buffer. Maybe the buffer is a string and the string just got too long - maybe you are appending another long string to it. This is another thing that is only going to happen during runtime. The solution is to allocate the new array to be at least as large as the new string needs:
char * new_array[buffer_size + appended_string_size];
Then to copy the old string and new string data into the new buffer - which I shall leave as an exercise for you! Finally you delete the old buffer and maybe assign the new buffer to the old buffer (remember new returns a pointer to the object(s) so we can do this sort of thing):
delete  array;
array = new_array;
Apart from the previously mentioned aspect of dynamic memory consider a program that reads a file or a string and builds up a tree of various objects to represent the data in the file or string - not so uncommon, take as an example an expression:
X = 1 + 2 * 10
Here we have all sorts of operators and things each of which can be represented by specific types of node in the tree (note: view the following using a fixed space font like Courier or Courier New):
Here we have leaf nodes of two types - literal values such as 1, 2 and 10 and named variables such as X. In between we have operator nodes =, + and *. Each of these operator objects have a left and right side. So the program parses the expression and build up the tree by creating nodes and associating them with other nodes. Which nodes do we need and when (and where) do we need them? In this case the sequence may be like the following:
Right = Node: Leaf: Literal(10)
Left = Node: Leaf: Literal(2)
Right = Node: Operator: *( Left, Right )
Left = Node: Leaf: Literal(1)
Right = Node: Operator: +(Left, Right)
Left = Node: Leaf: Variable("X")
Root = Node: Operator: =(Left, Right)
I have used and re-used the names Left and Right. These then will probably be temporary local variables used only to hold references (i.e. pointers) to the left and right nodes of the node to which they are joined. I have shown some idea of type commonality which would be modelled using inheritance such that Node is the base for all types in the tree, Leaf is an intermediate sub-class of Node for nodes that do not require any children and has sub-types Literal and Variable in our example. All operators do require children and they all have an intermediate Node sub-type of Operator. In our example the specific operator types are *, + and = although they may be called something like Multiply, Add and Assign in a real implementation.
Now the whole point of this exercise is to get you to appreciate when dynamic memory is useful - try doing the above not using new (and at some point delete to destroy the allocated objects). I think you will find you can do it easily and cleanly for a given expression, if you ignore the input expression string and the parsing. But then I say that the user can enter any valid expression for example they my enter something like:
Y = (5 - 4 * 7)/3
Or M = ( 2 * X - 11 ) / (Y + 6)
Assuming you had the additional node types and parsing logic to handle subtract, divide and parentheses you would still need to re-write a program to handle each combination of operators unless you use dynamic memory allocation to allocate the nodes of the expression.
Ok I suppose you could pre-allocate a set of each type of node and design them such that they could be re-used over and over again. But how many of each type should you pre-allocate? And what about the additional overhead of managing this cache of nodes? If you think this is fairly simple consider the case of rendering a web-page, any web-page. What range of node types or similar abstractions would you require? If pre-allocating how many of each type?
In general it is easier and more efficient to use just what you need when you need it - and dynamic memory does for those occasions when you only know all the details of what you want during the execution of the program.
However, the downside is that you have to remember to clean up the resources you allocate at runtime - such as closing file handles and releasing synchronisation locks and of course deleting the objects you dynamically allocate. If you forget then your system will loose resources over time - how much and how fast and how critical this is depends on the application. Such resource leakage will eventually cause a resource request to fail, deadlock the application, or slow the system to a point it is unusable and needs to be restarted. Remember many systems run 24 hours 7 days a week. Such failures are usually just annoying - maybe a web site is down for half an hour every week or so - but for some safety critical systems such as medical devices or air traffic control or fly by wire control systems such errors could prove fatal so in these cases the extra complexity and time spend to work with statically allocated resources would be worth it.
Now I should remind you that new and delete do _not_ just allocate and de-allocation memory for objects - they also construct and destruct these objects - even the built in types such as char and int. This is why you use delete  when deleting arrays (allocated using new ) as it informs the compiler that the pointer passed to delete  points to an array of objects - each of which needs to have its destructor called. If you call just plain delete on an array then only the first object will have its destructor called. This is probably not too terrible for a buffer of char but would be for a buffer of objects that have work to do in their destructors - for example a file-handle wrapper that closes the associated file handle if it is open when the wrapper object is destroyed. An array of such wrappers destroyed using delete would only close the first object's file and not the others. Note that although I said probably not too terrible for a buffer of char the fact is using the wrong form of delete for a pointer to dynamically allocated object of any type has undefined behaviour as far as the C++ standard is concerned.
Finally, have you come across the C++ standard library - it contains many useful types such as std::string and std::vector<> - see http://www.sgi.com/tech/stl/
for information on a section of it. Using such classes can often obviate the need for explicit dynamic memory management as they hide such details from you. It is still there though, under the hood.
I would suggest you also obtain some good C texts:
Effective C++, More Effective C++ and Effective STL by Scott Meyers (probably in that order - maybe swap the second and third if you wish to make use of the STL portion of the C++ standard library).
I do not know of a good beginners C++ book as I came to C++ with considerable development experience so I just dived straight in with
The C++ Programming Language 3rd edition by Bjarne Stroustrup.
It is readable and written by the inventor of C++ but it may not be the best quick reference or easiest book for someone new to programming as well as C++. Many people recommend
Accelerated C++ by Koenig and Moo
For those with some previous experience, but I have not read it. On general programming I found:
The Practice of Programming by Kernighan and Pike
To be a book I really wish was around when I started out. For the C++ standard library I use
The C++ Standard Library A Tutorial and Reference by Nicolai M. Josuttis
For the low down on C++ templates
C++ Templates by Vandevoorde and Josuttis
Is good. Oh and a copy of the C++ standard is a useful addition. Youcan get an electronic copy (in PDF format) for $18 from the ANSI e-store web site at http://webstore.ansi.org/ansidocstore/default.asp
- enter C++ into the search text box. It is not an easy read in any way but can be useful on occasions.
There are more books I could recommend but I hope these will get you started - some you will find more useful at first than others - pretty much in the order I listed them. You could browse the ACCU website for more book reviews: http://www.accu.org/