C++/dynamic creation of template class objects

Question
QUESTION: Hello,

I need to read a delimited file, extract the fields and other information, and store them in template class objects. Each field will go into its own template class instance, and the type can be different for each field (string, char, double, etc.). Because I don't know the number of fields beforehand, I don't know how many objects I have to create or, above all, how to refer to them in my program, because I can't name these objects dynamically.

I understand that with "simple" types, using maps or vectors would solve the problem, but my problem is that I have to use template classes. As I read the file, I need to create an object for each field and then write them to a binary file. Other functions of the program need to handle this kind of situation too.

Thank you in advance,

regards,
Daniel.

ANSWER: Hello Daniel.

It would be helpful if you gave me a sample of the delimited input file. It would also be good to see what you have done so far, so I can see what you need help with.

Generally, you need to parse the file, and somewhere in your code you need to dynamically create new instances of your classes using the new operator. It doesn't matter that they are template classes. The pointers created can be stored somewhere, perhaps in a vector or map. The objects will be referred to by their pointers. For that reason, you will need logic to decide which object you want to refer to. For example, if all the objects are stored in a map, and you want a particular object, you will have to find its pointer in the map using some search criteria.
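
As a rough illustration of the map idea, here is a minimal sketch. The names Holder, FieldMap, addField and findField are made up for the example, and it assumes a C++11 compiler so that std::unique_ptr can own the objects created with new. Only one instantiation (Holder<double>) is stored here; mixing several instantiations in one container needs a common base class.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the template class from the question.
template <typename T>
struct Holder {
    T value;
    explicit Holder(const T& v) : value(v) {}
};

// The map owns the dynamically created objects; the field name is the search key.
typedef std::map<std::string, std::unique_ptr<Holder<double> > > FieldMap;

// Creates a new object at run time and stores its pointer under the given name.
void addField(FieldMap& fields, const std::string& name, double v) {
    fields[name].reset(new Holder<double>(v));
}

// Returns a pointer to the stored object, or nullptr when the name is unknown.
const Holder<double>* findField(const FieldMap& fields, const std::string& name) {
    FieldMap::const_iterator it = fields.find(name);
    return it == fields.end() ? nullptr : it->second.get();
}
```

The caller never names the objects; it only keeps the key (here a string) and asks the map for the pointer when it needs the object.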

How you store your objects will depend on how they are to be used. Please explain that part to me.

The last part of your question is a bit unclear. Do you need to store the template objects in a binary file?

Let me know the details, and I'll be happy to help more.

Best regards
Zlatko

---------- FOLLOW-UP ----------

QUESTION: First of all, thank you very much for the answer. My basic template class to store the info is something like this:


template <typename TDATA, typename TREG>
class datos {
public:
    TDATA dato;
    TREG tipo;
    /* initialization list */
    datos(TDATA vdato = TDATA(), TREG tipo_dato = TREG())
        : dato(vdato), tipo(tipo_dato)
    { }
};




Then, in a higher-level class, I have a vector that contains (among other things) elements of type

vector <datos<TDATA,TREG> >

This is basically designed to hold the data itself and its type. The type is one of a set of predefined types that I know in advance. Of course, the data can be an integer, double, string, whatever.

Then a function will read the file. One of its arguments is the field name followed by its type, separated by a - character. For example, keeping it simple:

void read(string argument, string filename, char delim)
{}

then the call can be  read("field1-c|field2-f", "/home/data/example.txt", '#');

the file would be like this

A#4567.98
B#3.9087

In the function call, I can tell the type of the data from the arguments, because c is for char, f is for float, etc. I handle that in the body of the function; the character that comes after the - is used both to store and to identify the type of the data.

Then I have to store that data in a vector as described above, but I don't know the type at compile time, because those arguments are given by the user.

The way you propose, if I understood correctly, is to allocate dynamically with new, but I don't see how to point to a template class without defining its arguments. Does it have something to do with an abstract definition of the class and pointing to that?

The easy way I was also considering is to store all the info in string vectors and then convert them to the right type with istream, but this will consume more memory. This links to the last part.

The last part, the binary file, is because each field is going to be stored in a binary file. The problem is that I have to work with huge amounts of data, sometimes bigger than the available memory, so I have to "flush" to a file when a certain amount of memory is reached. Anyway, that's another issue, and in fact I'm working on a "serialization" method that works for my purposes for now.

Thank you again, Zlatko, and sorry if the text is not very clear; English is not my native language.

Best regards,
Daniel

ANSWER: Hello Daniel.

Your English is fine. It is much better than my Spanish. Early in my career, I spent many months in Guatemala and Bolivia. That was many years ago. I worked very hard to learn the language, but now I have forgotten everything.

I will try to help you with your program, but I need to know more about how the datos type will be used and what your program is meant to do. For now I can offer you only general ideas.

There is a problem with the idea of declaring the vector in this way
vector <datos<TDATA,TREG> > v;
When declaring the vector, you must specify the type that the vector will be holding. Here you are saying that it is holding a type of datos, but the datos is not fully specified because TDATA and TREG are not actual types. They are just template parameters.

One option is to have multiple vectors, one for each actual datos type. For example
vector <datos<int, int> > vA;
vector <datos<char, float> > vB;
I have no idea about what your actual template parameters will be.

Another option is to have one vector but storing pointers to a superclass of the datos type. Something like this:

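#include <cstddef>
#include <vector>
using namespace std;

// Common base class so that one vector can hold pointers to any Datos type.
class SuperClass
{
public:
   virtual ~SuperClass() {}   // lets delete through a SuperClass* destroy the whole object
};

template<typename TDATA, typename TREG> class Datos : public SuperClass
{
public:
   TDATA dato;
   TREG tipo;
   Datos(TDATA vdato = TDATA(), TREG tipo_dato = TREG()) : dato(vdato), tipo(tipo_dato) {}
};

vector <SuperClass*> V;

int main(void)
{
   V.push_back(new Datos<int, int>(123));

   // Release the objects before exiting.
   for (size_t i = 0; i < V.size(); ++i)
      delete V[i];
   return 0;
}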

It depends much on how you will be using your Datos class. It would be good if you can write your program so that it doesn't need to know exactly what type of Datos it is dealing with. It is usually the sign of a good design if your program accesses all the Datos instances through an interface and with polymorphism instead of having to worry about what exact type of Datos is being used. For example, if your program has many functions, each with lines like this pseudo-code

if (using datos type 1)
{
   do something
}
else if (using datos type 2)
{
   do something else
}


then your program is probably not designed in an object oriented way.
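
To make the polymorphism point a little more concrete, here is a small sketch under my own assumptions: a single template parameter, and a virtual function I have called toText, which is not something from your code. The calling function joinAll never asks which Datos type it is dealing with.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Interface that the rest of the program sees.
class SuperClass {
public:
    virtual ~SuperClass() {}
    virtual std::string toText() const = 0;  // hypothetical operation, for illustration only
};

template <typename TDATA>
class Datos : public SuperClass {
public:
    explicit Datos(const TDATA& v) : dato(v) {}
    // Each instantiation formats its own value with stream insertion.
    std::string toText() const {
        std::ostringstream out;
        out << dato;
        return out.str();
    }
private:
    TDATA dato;
};

// Works for any mix of Datos types without an if/else on the type.
std::string joinAll(const std::vector<const SuperClass*>& items, char delim) {
    std::string result;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i > 0) result += delim;
        result += items[i]->toText();
    }
    return result;
}
```

The type-specific behaviour lives inside Datos, and the virtual call selects it at run time.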

All this discussion may be irrelevant, because I don't really know what you are doing with datos.

Anyway, perhaps I can help you a bit with the read function. It seems that the first parameter is a character determining the type to be created. The creation can look something like this:


vector <SuperClass*> V;
void readFile(char type, string filename, char delim)
{
   switch(type)
   {
   case 'f':
       V.push_back(new Datos<float, ??>(??));
       break;

   case 'c':
       V.push_back(new Datos<char, ??>(??));
       break;

   }
}

I'm not sure what the TREG template parameter is for, and I'm not sure where the construction data is coming from. It looks like TREG is for specifying the type, (tipo means type?), but the type is actually already specified with TDATA. Perhaps you need just 1 template parameter.
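
If a single template parameter turns out to be enough, the creation step could look roughly like the sketch below. This is only an assumption about where the data comes from: I pretend the value arrives as one text token from the file. The names makeDatos and parseAs are invented for the example, and it assumes C++11 for std::unique_ptr.

```cpp
#include <memory>
#include <sstream>
#include <string>

class SuperClass {
public:
    virtual ~SuperClass() {}
};

template <typename TDATA>
class Datos : public SuperClass {
public:
    explicit Datos(const TDATA& v) : dato(v) {}
    TDATA dato;
};

// Hypothetical helper: converts one text token to T with stream extraction.
// A failed conversion is not reported in this sketch.
template <typename T>
T parseAs(const std::string& text) {
    std::istringstream in(text);
    T value = T();
    in >> value;
    return value;
}

// Chooses the concrete Datos type from the run-time type code.
// Returns an empty pointer for an unknown code.
std::unique_ptr<SuperClass> makeDatos(char type, const std::string& text) {
    switch (type) {
    case 'f':
        return std::unique_ptr<SuperClass>(new Datos<float>(parseAs<float>(text)));
    case 'i':
        return std::unique_ptr<SuperClass>(new Datos<int>(parseAs<int>(text)));
    case 'c':
        return std::unique_ptr<SuperClass>(new Datos<char>(text.empty() ? '\0' : text[0]));
    case 's':
        return std::unique_ptr<SuperClass>(new Datos<std::string>(text));
    default:
        return std::unique_ptr<SuperClass>();
    }
}
```

The switch on the type code appears once, at creation time. After that the program can work through SuperClass pointers.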

One more question for you. In the file
A#4567.98
B#3.9087
what do the A and B mean?

Maybe that will get you started. If it is more convenient for you to communicate with me by e mail, you can reach me at
zlatko doc c dot help at gmail dot com
Sometimes people like to send me attachments to better describe what they are working on.

Best regards
Zlatko


---------- FOLLOW-UP ----------

QUESTION: Hello Zlatko,

Thanks again. About the language: English is easier to retain because it is everywhere, especially when you work in technology or research, because the best books, documentation, and in this case even the programming languages themselves are written in English. For example, I studied French for three years over 13 years ago, and now I can't make a single sentence, ha ha.

About the program: I'll start by describing its overall goal. It's a project for a statistical data analysis program. That means the program has to read data (initially from a delimited text file, later from other sources) and store the data in its own binary file format. After that, the program should be able to do basic mathematical operations with numbers, vectors and matrices and finally, based on those basic operations, implement several statistical algorithms and methods. It's not a professional project but a personal one. I'm a statistician, and in my job I work with professional software that does this kind of thing, but this is a personal pastime that helps me learn C++ programming and the core of many statistical and mathematical methods and algorithms.

I'm now in the first phase, and I have designed the classes necessary to store the data. The data has to be in column format, i.e., each column is a variable/field and each line is a record, as usual in database implementations. To store this, and for efficiency in later calculations, I plan to store each field in a separate physical file. On top of that there's a class (also physically saved in a small file) that represents a table and links the individual field files into a "virtual table".

So we start with a file like the one in my previous question; I'll add one more field to illustrate the example better:

A#4567.98#ELEMENT A
B#3.9087#ELEMENT B

where A and B are data of type char, 4567.98 is data of type float, and "ELEMENT A" is data of type string. Let's name those fields CODE, QUANTITY and DESCRIPTION, and let's suppose they are the code of a product, its value and its description. The important thing is that the program has to work with files of this type, but they can contain 2, 10, or 100 fields and 2, 100 or 10 million rows. Only the user, in the final interface, will say what the file is and which fields and types are in it.

The basic class I've made for the data is the class datos

template <typename TDATA, typename TREG>
class datos {
public:
    TDATA dato;
    TREG tipo;
    /* initialization list */
    datos(TDATA vdato = TDATA(), TREG tipo_dato = TREG())
        : dato(vdato), tipo(tipo_dato)
    { }
};

The type TREG was initially designed to record the type when the data is numeric (float, double, etc.), being a char with values like c, f, i, or d. But if the data is a string, this parameter can store the length of the string (it would be an int), and if the "virtual type" is a date, this field would store another identifier. So this parameter, more than a type, is a kind of "joker" in my data definition. Anyway, the overall question wouldn't change if there were only one parameter in the template.

Above the class datos there's a class called campo (field in Spanish, as I'm sure you know, ha ha) that is implemented like this:

template <typename TDATA, typename TREG>
class campo {
public:
    string nombre;        // this is the name of the field
    char tipo;            // type of the data, given by the user. This tells us what type of data we're working with (char is c, string is s, date is d, etc.)
    string fichero_campo; // this is the name of the binary file that stores (or will store) the class
    vector<datos<TDATA, TREG> > vector_dato; // this is a vector that stores datos objects, defined above
    campo() {}            // constructor
    ~campo() {}
    void escritura_bin(char, char*); // method to implement physical binary serialization of the class
    void escritura_txt(char*);       // method to implement physical text serialization of the data (export to text file)
    campo<TDATA, TREG>& lectura_bin(char*); // method to read a binary file once I've defined a campo object. There's also a general reading function outside the class that reads a file and creates an instance of the class.
    // overloads of mathematical operations; only 5 or 6 shown, there are more than 50 distinct methods for a campo class.
    campo<TDATA, TREG>& operator+=(const campo<TDATA, TREG>&); // addition +=
    campo<TDATA, TREG>& operator+=(const double&);             // addition += constant
    campo<TDATA, TREG>& operator*=(const campo<TDATA, TREG>&); // product *=
    campo<TDATA, TREG>& operator*=(const double&);             // product *= constant
    campo<TDATA, TREG>& operator/=(const campo<TDATA, TREG>&); // division /=
    campo<TDATA, TREG>& operator/=(const double&);             // division /= constant
};
        
Now, if we look back at the file example, I have to read this file through a function like this:

void read(string argument, string filename, char delim)
{}

then the call can be  read("CODE-c|QUANTITY-f|DESCRIPTION-s", "/home/data/example.txt", '#');

The first argument has to be constructed like this, with the - character separating name and type, and the | character separating the fields. This is a syntax rule of the program; it always has to be so, or the program will give an error.
The types are:
the types are:

c for char
f for float
d for double
i for int
d for date (an special type of data in my program)
s for string

Then I have to start reading the file and assign a campo object, and of course datos objects, to each field.

I know I have to instantiate the class as, for example, datos<float,char> my_data or campo<float,char> my_field. Since I know the type of data and the name of the field I'm working with (because they are in the argument of the function), I can declare for the first field an object like campo<char,char> my_code. The same goes for the other two fields, and likewise for the data class, for example datos<char,char> my_data_code.

The problem, obviously, is that I don't know those things at compile time, and I have to create instances of the classes dynamically. Reading your answer, I have a clearer idea of what to do. The idea is to create a superclass (an abstract class, isn't it?) that can be pointed to, and then

class SuperClass
{
public:
};

template <typename TDATA, typename TREG> class datos : public SuperClass
{
public:
    TDATA dato;
    TREG tipo;
    /* initialization list */
    datos(TDATA vdato = TDATA(), TREG tipo_dato = TREG())
        : dato(vdato), tipo(tipo_dato)
    { }
};

then define vectors like this: vector <SuperClass*> V;

and finally, when reading, switch on the type and fill the vector:
void readFile(char type, string filename, char delim)
{
  switch(type)
  {
  case 'f':
      V.push_back(new datos<float, char>(value,'f'));
      break;

  case 'c':
      V.push_back(new datos<char, char>(value,'c'));
      break;

  }
}

But now I see that I have to handle the same situation with the class campo<TDATA,TREG>, doing the same thing with a superclass of that one. Or maybe I only have to create a superclass for campo<TDATA,TREG> and then use the new operator only for that class, since the datos<TDATA,TREG> parameters would then be known? I mean that, as the parameters are the same, with each new instance of campo<TDATA,TREG> it is easy, as you illustrate with the switch, to create a vector that stores a certain kind of data.

Well, I'm truly grateful for your help. I hope I have been clear and not too boring. You have already helped a lot.

Best regards,
Daniel

Answer
Hello Daniel.

Your project sounds very interesting and I am happy to do what I can to help out. Right now it seems to be in the early ideas stage, so if you don’t mind, I’d like to help by asking more questions.

If I understand correctly, a campo represents a field (meaning a column) of data from the data file, and each row of data will be an element in campo::vector_dato. Then each of the overloaded math operations will be an operation on the elements of vector_dato. For example, operator+= is vector addition. Some of the operations will not apply to dates and strings, and I suppose you will check at runtime for invalid operations. Am I correct so far?

In the datos class, I believe you need only one template parameter and that is TDATA. I understand your idea of the TREG being a char or an int. If it is a char type, then tipo holds a type and if it is an int, then tipo holds a string length. I think it is a misuse of templates. The type of dato is already specified by TDATA and TDATA can be used at compile time for type checking. The TREG cannot be used at compile time because its real use is to hold a type specified at run time. For now I’ll assume that there is a good use for the tipo field. Perhaps you will need it to disable some operations in campo.  Here is a suggestion for the datos class.

enum Tipo
{
   Char = 1,
   Short,
   Int,
   Double,
   Date,
   String
};

template <typename TDATA>
class datos
{          
public:

   TDATA dato;
   Tipo tipo;

   datos(TDATA vdato, Tipo tipo_dato) : dato(vdato),  tipo(tipo_dato)
   { ; }
};

As I understand it now, the campo will contain the vector of datos, so the campo template parameters will be used to specify the vector_dato type. There is no need for a SuperClass of datos. The following will compile.

template <typename TDATA>
class campo
{
public:
   string nombre;
   Tipo tipo;

   vector< datos <TDATA> > vector_dato;
};

I have some thoughts about your file.

A#4567.98#ELEMENT A
B#3.9087#ELEMENT B

It can contain many fields (meaning columns), perhaps 100, or more.  I suppose the read function
read(string argument, string filename, char delim)
will allow the user to specify which fields are of interest. I assume the user wouldn’t always want all the fields. How would the user specify that he wants the first and the 99th field? I suppose some user interface will take that information and construct the argument string, but it is more work for you to program a way to parse the argument string. I suggest that instead of an argument string, the function could accept a vector of integers, with the integers specifying which fields are desired. I would also suggest that you add meta-data to your data file to specify the type of each field and the name of each field. You could have a convention like this:
//#
//c#f#s
//Code#Quantity#Description
A#4567.98#ELEMENT A
B#3.9087#ELEMENT B
so that the file becomes more self-describing.  The meta-data contains the delimiter, the types of the fields, and the description of the fields. Your program can read these and present them to the user instead of making the user work to present them to the program.
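
Here is a rough sketch of how a program might read such a header. It assumes exactly the three-line convention shown above. The names FileHeader, readHeader and split are mine, not part of your program, and error handling is kept to a simple true/false result.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Splits one line on the delimiter. With this getline-based approach a
// trailing empty field is dropped.
std::vector<std::string> split(const std::string& line, char delim) {
    std::vector<std::string> parts;
    std::istringstream in(line);
    std::string part;
    while (std::getline(in, part, delim)) parts.push_back(part);
    return parts;
}

// Hypothetical container for the meta-data lines.
struct FileHeader {
    char delim;
    std::vector<char> types;
    std::vector<std::string> names;
};

// Reads the three "//" lines; returns false if they do not follow the convention.
bool readHeader(std::istream& in, FileHeader& header) {
    std::string line;
    // Line 1: "//" followed by the single delimiter character.
    if (!std::getline(in, line) || line.size() != 3 || line.compare(0, 2, "//") != 0)
        return false;
    header.delim = line[2];
    // Line 2: "//" followed by one type code per field.
    if (!std::getline(in, line) || line.compare(0, 2, "//") != 0)
        return false;
    header.types.clear();
    std::vector<std::string> codes = split(line.substr(2), header.delim);
    for (std::size_t i = 0; i < codes.size(); ++i) {
        if (codes[i].size() != 1) return false;
        header.types.push_back(codes[i][0]);
    }
    // Line 3: "//" followed by one name per field.
    if (!std::getline(in, line) || line.compare(0, 2, "//") != 0)
        return false;
    header.names = split(line.substr(2), header.delim);
    return !header.types.empty() && header.names.size() == header.types.size();
}
```

After readHeader succeeds, the stream is positioned at the first data row, so the same split function can be used on each remaining line with header.delim.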

What will the read function produce?  Will it create a number of campo objects  each containing a vector of data corresponding to the field in the file ? Perhaps, then campo should inherit from some superclass as you suggested in your last note.
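
One possible shape for that, purely as a sketch: each column becomes an object derived from a common base, and the read function returns a vector of pointers to the base. Everything named here (CampoBase, Campo with one parameter, makeCampo, readColumns) is an assumption for illustration, the selected columns are given as zero-based integers, and C++11 is assumed for std::unique_ptr.

```cpp
#include <cstddef>
#include <istream>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Common interface for all columns, whatever their element type.
class CampoBase {
public:
    virtual ~CampoBase() {}
    virtual void append(const std::string& text) = 0;  // convert and store one value
    virtual std::size_t size() const = 0;
};

template <typename TDATA>
class Campo : public CampoBase {
public:
    std::vector<TDATA> vector_dato;
    void append(const std::string& text) {
        std::istringstream in(text);
        TDATA value = TDATA();
        in >> value;  // conversion failures are not reported in this sketch
        vector_dato.push_back(value);
    }
    std::size_t size() const { return vector_dato.size(); }
};

// Strings are stored whole, so embedded spaces survive.
template <>
inline void Campo<std::string>::append(const std::string& text) {
    vector_dato.push_back(text);
}

// Maps a type code to a new, empty column; empty pointer for an unknown code.
std::unique_ptr<CampoBase> makeCampo(char type) {
    switch (type) {
    case 'c': return std::unique_ptr<CampoBase>(new Campo<char>());
    case 'f': return std::unique_ptr<CampoBase>(new Campo<float>());
    case 'i': return std::unique_ptr<CampoBase>(new Campo<int>());
    case 's': return std::unique_ptr<CampoBase>(new Campo<std::string>());
    default:  return std::unique_ptr<CampoBase>();
    }
}

// types[i] is the type code of column i; wanted lists the zero-based columns to keep.
std::vector<std::unique_ptr<CampoBase> > readColumns(std::istream& in, char delim,
                                                     const std::vector<char>& types,
                                                     const std::vector<int>& wanted) {
    std::vector<std::unique_ptr<CampoBase> > columns;
    for (std::size_t k = 0; k < wanted.size(); ++k)
        columns.push_back(makeCampo(types.at(wanted[k])));
    std::string line;
    while (std::getline(in, line)) {
        std::vector<std::string> parts;
        std::istringstream row(line);
        std::string part;
        while (std::getline(row, part, delim)) parts.push_back(part);
        for (std::size_t k = 0; k < wanted.size(); ++k) {
            std::size_t idx = static_cast<std::size_t>(wanted[k]);
            if (columns[k] && idx < parts.size()) columns[k]->append(parts[idx]);
        }
    }
    return columns;
}
```

With the two sample rows, asking for columns 0 and 2 would give one Campo<char> and one Campo<std::string>, each holding two values.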

Well, that is enough for today. Let me know what you think. If you want help with some of the coding of the read function, I can do that, but let me know the final form of the parameters and outputs of the function. I have some more ideas I want to think about. Specifically, I don’t like needing to check at runtime if an operation is valid or not. It should be something the compiler can detect based on the type. I will need to think about it more.
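
One direction I may explore for the compile-time check, shown only as a tentative sketch: with a C++11 compiler, a static_assert inside the operator can reject non-numeric field types. The single-parameter Campo below is an assumption, not your class.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

template <typename TDATA>
class Campo {
public:
    std::vector<TDATA> vector_dato;

    // Element-wise addition over the common length of the two columns.
    // The static_assert fires only if this operator is actually used
    // with a non-arithmetic TDATA, such as std::string.
    Campo& operator+=(const Campo& other) {
        static_assert(std::is_arithmetic<TDATA>::value,
                      "operator+= needs a numeric field type");
        for (std::size_t i = 0; i < vector_dato.size() && i < other.vector_dato.size(); ++i)
            vector_dato[i] += other.vector_dato[i];
        return *this;
    }
};
```

Because member functions of a class template are only instantiated when used, Campo<std::string> can still be created and filled; only a line such as s += s would fail to compile. Note that char counts as an arithmetic type, so this check alone would not stop addition on a char field.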


©2012 About.com, a part of The New York Times Company. All rights reserved.