C++/reading specific range in text files
Expert: Ralph McArdell - 9/13/2006
Question
Hi,
Sir, I am trying to read a specific position in a text file (let's say the number at line 20, column 10), and I want to store it in an integer variable until I make my comparison.
Generally, I am at an elementary stage, but I tried to write something:
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main()
{
    string fileline;
    ifstream file("myfile.txt");
    while (!file.eof())
    {
        getline(file, fileline);
        cout << fileline << endl;
    }
    file.close();
    system("pause");
    return 0;
}
As you can see here, I can read whole lines into a string, but I can do nothing with them and I cannot specify the position I need in the text file.
Thank you very much in advance for your consideration.
mahmud
Answer
This seems like a reasonable thing to do and I have come across such requirements in real life.
Directly computing positions in text files from line and character numbers is generally a futile exercise as line lengths vary. What we see as a two dimensional page of text is in fact a single dimensional collection of characters. It just happens that some of those characters change the formatting when displayed: they stop the text continuing across the editor and make it wrap onto the next line.
To perform random access positioning in multiple dimensions when the underlying data is really single dimensional requires fixed length spans. This is how C and C++ built-in arrays work: an array of 10 rows of 20 columns is really a linear arrangement of 10*20 objects in memory, and the object at index [4][5] is at location 20*4 + 5 (location 85), i.e. 20*row + col. In general the formula to find the object location within a C/C++ 2D array is numberOfColumns*row + column, where numberOfColumns is the fixed length of each row.
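As a small illustration of that formula, here is a minimal sketch. The names FlatIndex and StoreInGrid are made up for this example, and the grid is held in a std::vector rather than a built-in array purely to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original answer): offset of element
// [row][col] in a flat, row-major block with numCols elements per row.
std::size_t FlatIndex(std::size_t row, std::size_t col, std::size_t numCols)
{
    assert(col < numCols); // a column outside the row would alias another row
    return numCols * row + col;
}

// Stores value at [row][col] of a 10 x 20 grid kept in one flat vector
// and returns the offset that was used.
std::size_t StoreInGrid(std::vector<int> & grid, std::size_t row,
                        std::size_t col, int value)
{
    const std::size_t numCols = 20; // fixed row length, as in the text
    const std::size_t offset = FlatIndex(row, col, numCols);
    grid.at(offset) = value; // at() throws if the offset is outside the grid
    return offset;
}
```

With a grid created as std::vector<int> grid(10 * 20), the call StoreInGrid(grid, 4, 5, 42) writes to offset 85, the same location worked out by hand above. The scheme only works because numCols is the same for every row.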
However if we do not have a fixed size for the number of columns - as is the case with a text file - then the whole scheme falls apart.
OK, so maybe you have fixed line lengths. So answer me this: how many characters in the file make up a new line? Although in C and C++ you only process one character ('\n') for a new line when reading text files, the number of characters actually present in the file is typically 1 or 2, and which ones, and in which order, varies with the operating system. Typically the combinations come from the set {'\n','\r'}, or to put it another way, some combination of linefeed and carriage return. Secondly, when positioning within a file using seekg or seekp, does this take into account the character type and whether the file is being processed in text (new line translation) or binary (raw) mode? The answer is yes, probably, but how it does so can be confusing!
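If you want to see which line-end characters a particular file really contains, one option is to open it in binary mode, where no new line translation takes place, and count them. The following is only a sketch; LineEndCounts and CountLineEndBytes are names invented for this example.

```cpp
#include <fstream>
#include <string>

// Hypothetical result type: how many of each line-end byte the file holds.
struct LineEndCounts
{
    unsigned long carriageReturns; // '\r' bytes
    unsigned long lineFeeds;       // '\n' bytes
};

// Opens the file in binary (raw) mode so that no new line translation is
// applied, then counts the '\r' and '\n' bytes actually stored in it.
// A file that cannot be opened simply yields two zero counts.
LineEndCounts CountLineEndBytes(const std::string & path)
{
    LineEndCounts counts = { 0, 0 };
    std::ifstream in(path.c_str(), std::ios::binary);
    char ch;
    while (in.get(ch))
    {
        if (ch == '\r')
            ++counts.carriageReturns;
        else if (ch == '\n')
            ++counts.lineFeeds;
    }
    return counts;
}
```

A file whose two counts are equal most likely uses the two-character carriage return plus linefeed ending; a file with linefeeds only uses the single-character ending. Either way, a byte offset computed from line and column numbers has to allow for the difference.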
In short the easiest and safest way to position to the 20th line in a text file is to start at the beginning (or some other known point) and read and throw away the first 19 lines. For example one way to do this based on your existing code might look like this:
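ifstream & SeekToLine( ifstream & strm, unsigned int lineNumber )
{
    string line;
    unsigned int count(0);
    strm.clear();            // reset any earlier eof/fail state so the seek can work
    strm.seekg(0, ios::beg); // rewind to the start of the file
    // read and throw away lines until lineNumber lines have been skipped
    while ( ! strm.eof() && strm && count < lineNumber )
    {
        getline( strm, line );
        ++count;
    }
    return strm;
}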
The above function takes and returns a reference to an ifstream and also takes an absolute line number (starting at 0). If you wish to position to line 0, the file is rewound to the beginning (the line strm.seekg(0, ios::beg);) and that is all: count is 0, lineNumber is 0, and 0 is not less than 0, so the final while loop condition fails and the function returns with the file positioned to read the initial line. Likewise, if you position to line 1, the first line (line 0) is read and count becomes 1, which is no longer less than 1, so the function returns with the file positioned ready to read line 1.
So the statement:
SeekToLine( file, 19 );
will get your file positioned ready to read the 20th line (line 19 counting from 0). In fact it is probably better to check the state of the stream on return from SeekToLine:
if ( SeekToLine( file, 19 ).good() )
{
    // get on with it...
}
else
{
    // handle eof or bad stream state
}
Well the obvious thing to do in this case is read the whole of the line:
if ( SeekToLine( file, 19 ).good() )
{
    getline( file, fileline );
}
// ...
If all goes well fileline will contain the text of the 20th line in the file. You can check there are at least 10 characters in this line using the size() or length() operations of std::string (you need to check this if you wish to obtain the 10th character of the line!). You can then access the 10th character using fileline[9].
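Since you want the result in an integer variable, the length check and the character access can be wrapped up together. This is a sketch only: DigitAt is a name I have made up, and it assumes the character you want is a single decimal digit.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: if line has a decimal digit at position index
// (counting from 0), stores its numeric value in value and returns true.
// Returns false, leaving value untouched, if the line is too short or the
// character at that position is not a digit.
bool DigitAt(const std::string & line, std::size_t index, int & value)
{
    if (index >= line.size())
        return false; // fewer than index + 1 characters in the line
    const unsigned char ch = static_cast<unsigned char>(line[index]);
    if (!std::isdigit(ch))
        return false; // not a digit, so nothing sensible to store
    value = ch - '0';
    return true;
}
```

After the getline call above you could then write something like: int number(0); if ( DigitAt( fileline, 9, number ) ) { /* compare number */ }.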
Of course if you wanted to position to line 20, character 10 you could modify SeekToLine to take a second integer value representing the character position to seek along the line requested. You could do this using for example the get() operation for individual characters. Note that it would most likely be some sort of error if you located a line end while reading up to the requested character on the requested line. You should probably change the name of SeekToLine to something like SeekToLineAndCharacter.
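One possible shape for such a function is sketched below. It is an assumption on my part rather than tested production code: both positions count from 0, and meeting a line end before the requested character is treated as an error by setting the stream's fail state.

```cpp
#include <fstream>
#include <string>

// Sketch of the suggested SeekToLineAndCharacter. Rewinds the file, throws
// away lineNumber whole lines, then throws away charNumber characters of
// the following line using get().
std::ifstream & SeekToLineAndCharacter(std::ifstream & strm,
                                       unsigned int lineNumber,
                                       unsigned int charNumber)
{
    strm.clear();                 // reset any earlier eof/fail state
    strm.seekg(0, std::ios::beg); // start from a known point

    std::string line;
    for (unsigned int count = 0; strm && count < lineNumber; ++count)
        std::getline(strm, line); // discard one whole line

    for (unsigned int count = 0; strm && count < charNumber; ++count)
    {
        char ch;
        // A line end before the requested character means the line is too
        // short, so flag the stream as failed and stop.
        if (strm.get(ch) && ch == '\n')
            strm.setstate(std::ios::failbit);
    }
    return strm;
}
```

As with SeekToLine, check the returned stream with good() before reading. On success the next get() call returns the requested character.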
If you meant something else by column 10 such as the 10th value across the line then you can do something like the following (assuming the values are separated by white space):
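if ( SeekToLine( file, 19 ).good() )
{
    // read and throw away the first 9 white space separated values
    for ( int i(0); i < 9; ++i )
    {
        string junk_value;
        file >> junk_value;
    }
    // the next value on the line is the 10th one
    double the_number_in_10th_column(0.0);
    file >> the_number_in_10th_column;
}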
Here I assume the number you want is a double value. The lines might look like:
1234 aaa 234 0.0 xyz 666 uuu qwerty 1.234 3.145926 rrrr wwwww
If the above were your 20th line then the for loop reads the first 9 values (and throws them away), i.e. the values: 1234 aaa 234 0.0 xyz 666 uuu qwerty 1.234. Then the next value is read into the double the_number_in_10th_column (i.e. 3.145926).
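One thing to watch with extracting values straight from the file is that the >> operator treats a new line as ordinary white space, so if the line holds fewer than 10 values the reads carry on into the following line. A variation that avoids this is to read the whole line first and then pick the values out of a std::istringstream. The sketch below does that; ReadValueAt is a name I have invented, and both the line number and the value index count from 0.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: reads the white space separated value at position
// valueIndex on line lineNumber and stores it in value. Returns false if
// the line does not exist, has too few values, or the value found there
// is not a number.
bool ReadValueAt(std::ifstream & file, unsigned int lineNumber,
                 unsigned int valueIndex, double & value)
{
    file.clear();                 // reset any earlier eof/fail state
    file.seekg(0, std::ios::beg); // start from the beginning of the file

    std::string line;
    // Read up to and including the requested line; only the last is kept.
    for (unsigned int i = 0; i <= lineNumber; ++i)
        if (!std::getline(file, line))
            return false;

    // Parse that one line only, so a short line cannot spill into the next.
    std::istringstream fields(line);
    std::string junk_value;
    for (unsigned int i = 0; i < valueIndex; ++i)
        if (!(fields >> junk_value))
            return false;

    return !(fields >> value).fail();
}
```

For the 20th line and 10th value you would call ReadValueAt( file, 19, 9, the_number ) and test the result before using the_number in your comparison.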
The examples here are only one way to approach your problem. Hopefully they will get you moving, however. Note that I did not compile the code shown here so it may contain errors and typos, for which I apologise.