C++/Store Vectors to a file

Question
Hi,

How do I store vectors in a file?

Let's say I have 3 vectors: dummy1, dummy2, and dummy3.

Personally, for storing, I'd like to do something like
myFile<< dummy1<<dummy2<<dummy3;

and while reading this from other program, I'd like to do:
myFile>>dummy1>>dummy2>>dummy3;

but it seems that the operators << and >> are not defined for vectors.

What is the proper way of doing it?

Best Regards
Gaurav Kapoor


Answer
I do not know for sure why std::vector (and the other C++ standard library container types) do not have operator<< and operator>>, but here are some possible reasons off the top of my head:

1/ Would presumably only be useful for vectors of types having these operators.

2/ Would therefore possibly increase requirements on types that can be used as std::vector elements (i.e. they should be usable with operator<< and operator>>). Or should this be an optional requirement?

3/ Exactly what format would be used to write out the vector? Should it be for persistence, so that the size is output (needed in your example so that when reading back in you can tell where one set of data ends and the next set starts), or just for display, so that only the data items are output, or something else?

4/ Probably not too useful in the general case. This is particularly true of reading data, where the data may not be formatted correctly, so file formats have to be devised to aid in the detection of corrupted data. Simple reading and writing requirements can be achieved quite easily using other techniques.

So what other techniques could we use?

[
Note on code presented here:

First, all code is for example purposes only and lacks production quality features such as error checking and handling. It was written straight off the top of my head so may contain typos, mistakes or errors, for which I apologise. As always, code is best viewed using a fixed-width font such as Courier.

Second, I have used a type T for the type of items in your vectors, as you do not say in your question what types they contain.
]

You could just use a loop:

   for ( std::vector<T>::iterator pos=dummy1.begin()
       ;  pos!=dummy1.end()
       ; ++pos
       )
   {
       myFile << *pos << '\n';
   }

Or you could use std::for_each:

   std::for_each( dummy1.begin(), dummy1.end(), OutputItem );

Where OutputItem is a function:

   void OutputItem( T const & item )
   {
       myFile << item << '\n';
   }

Of course this implies the function has access to myFile which it probably does not. In that case we can use a functor - an object of a class type that implements operator(), and can be initialised with such state information:

   class OutputItem
   {
   public:
       explicit OutputItem( std::ofstream & stream )
         : stream_(&stream)
         {
         }

       void operator()( T const & item )
       {
         *stream_ << item << '\n';
       }

   private:
       std::ofstream * stream_;
   };

We then pass for_each an instance of OutputItem:

   std::for_each( dummy1.begin(), dummy1.end(), OutputItem(myFile) );

I am using a temporary instance here but you might choose to use a local instance and use it 3 times: once each for writing dummy1, dummy2 and dummy3 (assuming they all hold the same type of data).

However there is an even easier way. Use stream iterator adapters. Here is an example using an output stream iterator:

 std::ostream_iterator<T> writer(myFile, "\n" );

 std::copy( dummy1.begin(), dummy1.end(), writer );

The above writes the items in the range dummy1.begin() to dummy1.end() to the output 'range' starting at writer. Each item is simply written to the stream associated with the output iterator, in this case myFile. The second argument to the writer constructor is a delimiter string that is written after every item (including the last one), in this case a new line so that the items appear one per line in the file.

The converse is also possible using an input stream iterator:

 std::istream_iterator<T> reader(myFile);
 std::istream_iterator<T> readerEOF;

 std::copy( reader, readerEOF, dummy1.begin() );

As is often the case, input is more complex than output. In this case we need to detect the end of the input sequence. For an input stream iterator the sequence ends when a read fails, which normally means the end of the file has been reached (although a badly formatted item ends it as well). We create an input stream iterator that 'points' to the end of the stream by constructing it using its default constructor, as I do for readerEOF above. We then copy from the current file position of myFile (which here must be an input stream such as a std::ifstream) to the end of the file into dummy1. Note that copying to dummy1.begin() overwrites existing elements, so dummy1 must already hold at least as many elements as are read. If dummy1 were empty (or we wished to append the data to the end of the vector) then we would need to create new elements, which we do using a back inserter:

 std::copy( reader, readerEOF, std::back_inserter(dummy1) );

It is called a back inserter because it inserts new elements at the back of the container (for those container types that support inserting elements to their front there is a front inserter as well).

The problem for you with using an input stream iterator with copy is that it will copy the whole set of data into the same sequence, i.e. into only one vector, and you have three vectors' worth of data. In this case using a for loop is probably the simplest approach.

In any case you will have to write information in addition to the raw vector element data. For example you might write the size of each vector (i.e. the number of elements in each vector) before writing out the element data. You read each size value first, then read that many elements into the corresponding vector. Of course if something happens to the file then it all falls apart. If for example the first vector contained 5, the next 3 and the last 2 large integer values then the file might look like this:

5
1223430
4234324
7346588
1233499
9874536
3
123457
124874
098763
2
768754
120986

If the first line is deleted then your code will most likely try to read 1223430 values. It cannot, so it will only read up to the end of the file. The first vector then contains all the remaining values, including the size values for vectors 2 and 3, while vectors 2 and 3 are left empty.

Your code should take such things into account. For example you might like to delimit the raw data values with additional tags, such as:

SIZE 5
BEGIN
1223430
4234324
7346588
1233499
9874536
END
SIZE 3
BEGIN
123457
124874
098763
END
SIZE 2
BEGIN
768754
120986
END

This allows you to write a more robust reader, as you can detect when the following occur:

- a keyword is not where you expect it to be which allows detection of:
   - SIZE entry is not where expected
   - size value is not where expected (if you read a keyword instead).
   - data set does not start where expected
   - data set has more items than expected
- possibly when a data set ends prematurely according to the previously read in size value, as you would read an END keyword. This most obviously works for numeric data sets where number forming characters are expected and a keyword string is found which leaves the stream in a failed state (i.e. formatting error), but could probably be made to work for string data so long as there are well defined values for the strings (e.g. maybe @@ is never a legal data value, so you could use @@SIZE, @@BEGIN and @@END for keywords).

This is not perfect but much better than the previous attempt. However the code that handles all this added complexity can be somewhat fiddly, and is not general.

You will need to include <algorithm> for std::for_each and std::copy, <iterator> for the stream iterator types and std::back_inserter, <fstream> for the file stream types and <vector> for std::vector. Note that I am assuming a reasonably modern and standards-compliant standard library implementation.

If you wish to know more about these standard library topics I suggest you obtain a good reference such as "The C++ Standard Library: A Tutorial and Reference" by Nicolai M. Josuttis.


Ralph McArdell

Expertise

I am a software developer with more than 15 years C++ experience and over 25 years experience developing a wide variety of applications for Windows NT/2000/XP, UNIX, Linux and other platforms. I can help with basic to advanced C++, C (although I do not write just-C much if at all these days so maybe ask in the C section about purely C matters), software development and many platform specific and system development problems.

Experience

My career started in the mid 1980s working as a batch process operator for the now defunct Inner London Education Authority, working on Prime mini computers. I then moved into the role of Programmer / Analyst, also on the Primes, then into technical support and finally into the micro computing section, using a variety of 16 and 8 bit machines. Following the demise of the ILEA I worked for a small company, now gone, called Hodos. I worked on a part task train simulator using C and the Intel DVI (Digital Video Interactive) - the hardware based predecessor to Indeo. Other projects included a CGI based train simulator (different goals to the first), and various other projects in C and Visual Basic (er, version 1 that is). When Hodos went into receivership I went freelance and finally managed to start working in C++. I initially had contracts working on train simulators (surprise) and multimedia - I worked on many of the Dorling Kindersley CD-ROM titles and wrote the screensaver games for the Wallace and Gromit Cracking Animator CD. My more recent contracts have been more traditionally IT based, working predominately in C++ on MS Windows NT, 2000, XP, Linux and UN*X. These projects have had wide ranging additional skill sets including system analysis and design, databases and SQL in various guises, C#, client server and remoting, cross porting applications between platforms and various client development processes. I have an interest in the development of the C++ core language and libraries and try to keep up with at least some of the papers on the ISO C++ Standard Committee site at http://www.open-std.org/jtc1/sc22/wg21/.


©2016 About.com. All rights reserved.