C++/reading from a file
I'm taking an intro C++ course and am trying to find a practical use for it. I'm trying to write a program to manipulate a text file and put it into Excel. I know how to do this, but I'm running into a problem when reading in the data. The text document is essentially 7 columns (group#, name, ssn, plan1, plan2, cost, cobra) separated by tabs. I can easily read in the group#, ssn, plan1, plan2, and cost, but I'm having trouble reading the name and cobra columns. The reason for this is that the name "field" contains spaces and the cobra "field" is blank for most rows. Here are two examples of what the text file looks like (I've separated the fields with brackets):
 [Employee A]  [RIDE V01] [ZHE1 ZSV1] [$4.27] [COBRA Dt : 08/01/2010]
 [Employee B]  [RIDE V01] [ZHE1 ZSV1] [$4.27]
How can I get the entire name into one variable? I thought of using the getline function, but that would read the entire row into the variable which I do not want. Also how do I tell the program to store whatever is in the cobra column to that variable, whether the column is blank or has a value? I have no idea how to do that. Once I get those into the variables, the manipulation and output would not be an issue. Thanks.
Hmm, well, reading such data can be problematic. However, if your fields are always separated by tabs, and no field itself contains a tab (only spaces), then you could use the form of getline that takes a delimiter character. Specify a tab as the delimiter for every field except the last one, which needs some care anyway because it is optional. The idea would be something like this:
std::getline(fileInputStream, groupFieldValue, '\t');
std::getline(fileInputStream, nameFieldValue, '\t');
. . .
. . .
. . .
std::getline(fileInputStream, costFieldValue, '\t');
std::getline(fileInputStream, cobraFieldValue); // read up to end of line - which will be blank if no field.
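I have left out all error checking for brevity, but you should check the state of the input stream after each I/O operation to make sure it is still in a good state. Note also that if a row with no cobra value has no tab after the cost field, the tab-delimited read of the cost field will run past the end of that line and into the next row, so you may prefer to read each row into a std::string first and split that. Refer to a good C++ library reference for details of IOStream state and error conditions, e.g. "The C++ Standard Library: A Tutorial and Reference" by Nicolai M. Josuttis.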
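To make the error checking a little more concrete, here is a minimal sketch of one way to do it. It reads each row into a std::string first and then applies the tab-delimited getline calls to a std::istringstream built from that row, checking the stream after every read. The Record struct and the readRecord function are names I have made up for illustration, and the field order is assumed to be the seven columns listed in the question.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical holder for the seven columns described in the question.
struct Record
{
    std::string group, name, ssn, plan1, plan2, cost, cobra;
};

// Hypothetical helper: reads one row from 'in' into 'r'.
// Returns false at end of input or if a required field is missing.
bool readRecord(std::istream& in, Record& r)
{
    std::string line;
    if (!std::getline(in, line))          // end of file or stream error
        return false;

    std::istringstream fields(line);      // split only within this one row

    // Each getline returns the stream, which tests false after a failed read.
    if (!std::getline(fields, r.group, '\t')) return false;
    if (!std::getline(fields, r.name,  '\t')) return false;
    if (!std::getline(fields, r.ssn,   '\t')) return false;
    if (!std::getline(fields, r.plan1, '\t')) return false;
    if (!std::getline(fields, r.plan2, '\t')) return false;
    if (!std::getline(fields, r.cost,  '\t')) return false;

    // The cobra field is optional. Clear it first, because a failed
    // getline does not reset the string it was asked to fill.
    r.cobra.clear();
    std::getline(fields, r.cobra);
    return true;
}
```

You would typically call this in a loop such as while (readRecord(fileInputStream, record)) { ... }, where fileInputStream is your open std::ifstream. Note that this sketch treats an empty required field at the end of a row as an error, which may or may not suit your data.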
You will notice that I am using C++ library std::strings rather than C-style zero-terminated arrays of char, and using the often overlooked standalone getline functions that read into a std::string rather than a (fixed size) char array buffer. One reason for this is that you do not need to worry (too much) about the size of std::strings: they resize themselves as required. Another reason is that std::string objects have a lot of member functions, including operations for finding positions of interest within a string and for extracting substrings, which can be of use if you need to clean up any of the raw field values extracted using std::getline.
Include <string> to use std::string and std::getline.
Again refer to a good C++ library reference for more details on std::string and std::getline.
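As an example of the kind of clean-up mentioned above, here is a small sketch that strips leading and trailing whitespace from a raw field value using only std::string member functions. The function name trim is my own choice and not something from the standard library.

```cpp
#include <string>

// Hypothetical helper: returns a copy of 's' without leading or
// trailing spaces, tabs, carriage returns or newlines.
std::string trim(const std::string& s)
{
    const char* const whitespace = " \t\r\n";

    const std::string::size_type first = s.find_first_not_of(whitespace);
    if (first == std::string::npos)       // empty or all whitespace
        return std::string();

    const std::string::size_type last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}
```

Stripping '\r' can be handy if the file was saved with Windows line endings and is read where those are not translated, in which case the last field of each row would otherwise end with a stray carriage return.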
If some of the field values do contain tabs within them then you have a much trickier task ahead of you. In this case I would suggest reading each record as a whole line into a std::string then using the operations on a std::string to determine where the fields' start and end points are and extract them as substrings into your field value holding objects:
std::string record;
std::getline(fileInputStream, record); // getline returns the stream, not the string, so it cannot initialise record
std::string::size_type groupEndPos( record.find( '\t') );
std::string groupFieldValue( record.substr(0, groupEndPos) ); // 0 is start character
// groupEndPos is the number of characters in the substring
// in fact it should be groupEndPos - startPos
// but in this case startPos is 0 so just
// groupEndPos will do.
std::string::size_type nameStartPos( groupEndPos+1 ); // Name field starts after tab between group and name fields
// The position of the tab between the name and ssn fields is one before the first digit character of the ssn field
std::string::size_type nameEndPos( record.find_first_of( "0123456789", nameStartPos)-1 );
std::string nameFieldValue( record.substr(nameStartPos, nameEndPos-nameStartPos) );
. . .
. . .
. . .
Again I have elided any error checking for brevity. Note that:
- if a find or find_xxx operation cannot locate a matching character or substring at any position in the string
then the value std::string::npos is returned
- the start position for a field should always be less than or equal to the end position for that field, otherwise the
number of characters in a substring calculation would be negative, and because std::string::size_type is unsigned it will instead wrap around to a very large value
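The two points above could be checked along the following lines. This is only a sketch: extractGroupAndName is a name I have invented, and it relies on the same assumptions as the snippet above, namely that the ssn field starts with a digit and that the name field contains no digits.

```cpp
#include <string>

// Hypothetical helper: pulls the group and name fields out of one record.
// Returns false, leaving 'group' and 'name' untouched, if the record does
// not have the expected shape.
bool extractGroupAndName(const std::string& record,
                         std::string& group,
                         std::string& name)
{
    const std::string::size_type groupEndPos = record.find('\t');
    if (groupEndPos == std::string::npos)         // no tab at all
        return false;

    const std::string::size_type nameStartPos = groupEndPos + 1;

    // First digit after the group field is taken to be the start of the ssn.
    const std::string::size_type ssnStartPos =
        record.find_first_of("0123456789", nameStartPos);
    if (ssnStartPos == std::string::npos)         // no ssn found
        return false;
    if (ssnStartPos <= nameStartPos)              // would make the length wrap around
        return false;

    const std::string::size_type nameEndPos = ssnStartPos - 1; // tab before the ssn

    group = record.substr(0, groupEndPos);
    name  = record.substr(nameStartPos, nameEndPos - nameStartPos);
    return true;
}
```

Checking for npos before doing any arithmetic matters because npos - 1 is still a huge position, which substr would quietly treat as "to the end of the string" rather than report as an error.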
Finally, please note that the above code snippets are all straight out of my head and have not been compiled, so they may contain errors or typos. If this is the case then I apologise.
Hope this gives you some pointers.