C++/Reading Hex string from a stream

Question
Hello,

Could you please point me to a way to read a hex string (from a stream) that was saved by the following code:

ofstream ww("test.txt");
Byte bmsg[40];
// bmsg[0..15] gets binary values here,
for (int i=0; i<16; i++)
{
  // Add leading 0s for the broken STL hex support.
  if (0==bmsg[i])
     ww << "0";
  if (16>bmsg[i])
     ww << "0";
  ww << std::hex << 0+bmsg[i];
}

A request: the "read" method should be in standard (STL) C++

Thanks.

Answer
Well, as STL implies the Standard Template Library portion of the C++ standard library (containers, iterators, algorithms and the like) and not things like IOStreams and Locales, I think that you are out of luck.

However if you really meant a hosted implementation of ISO standard C++ that includes the full C++ standard library then I think you will be in more luck.

As an aside, note that there are at least two versions of ISO Standard C++: the original 1998 version and the primarily bug-fix 2003 update. There are also the non-normative TR1 additions to the C++ standard library (non-normative means an implementation does not need to include such features to comply with the standard).

An observation on the code presented indicates that you do not require the test for a value being zero as zero is also less than 16!

Also, are you sure your IOStream library implementation is 'broken'? I think it is not, and that it is your code that is 'broken', in that it should ask the stream to format the output as you require - a width of 2 with leading zeros:

   ww << std::hex << std::setw(2) << std::setfill('0') << 0+bmsg[i];

You should include <iomanip> to use std::setw and std::setfill etc.

You can also convert the Byte (a type alias for unsigned char I presume) to an int like so:

   ww << std::hex << std::setw(2) << std::setfill('0') << int(bmsg[i]);
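
To see the manipulators at work without touching a file, here is a minimal sketch that formats into a std::ostringstream instead. The helper name toHexString and the typedef for Byte are my own assumptions for illustration; they are not part of the question's code.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

typedef unsigned char Byte; // assumed definition of the question's Byte type

// Hypothetical helper: formats each byte as exactly two lower-case hex digits.
std::string toHexString(Byte const * data, std::size_t count)
{
    std::ostringstream out;
    out << std::hex << std::setfill('0');    // base and fill persist once set
    for (std::size_t i = 0; i != count; ++i)
    {
        out << std::setw(2) << int(data[i]); // width must be set for every item
    }
    return out.str();
}
```

Calling toHexString on the four bytes 0x00, 0x0a, 0x10 and 0xff should give the string "000a10ff", with no special casing for small values.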

You can also set the field width, fill and base directly on the stream:

   std::ofstream ww("test.txt");
   ww.setf(std::ios::hex, std::ios::basefield);
   ww.fill('0');

   Byte bmsg[40];
   // bmsg[0..15] gets binary values here,
   for (int i=0; i<16; i++)
   {
       ww.width(2);        // *must* set each time as reset after formatted I/O operations
       ww << int(bmsg[i]);
   }

Notice that the width is reset to its default value of 0 after each formatted input or output operation (i.e. via operator>> or operator<<). The other values remain in whatever state they were last set to, so we only need to set them once.
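
The difference between the one-shot width and the persistent fill and base can be shown with a small sketch. The function name widthResetDemo is hypothetical, and the output goes to a std::ostringstream so the result can be inspected as a string.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical demonstration: width applies to the next item only,
// whereas the fill character and the base stay in force.
std::string widthResetDemo()
{
    std::ostringstream out;
    out.setf(std::ios::hex, std::ios::basefield);
    out.fill('0');

    out.width(2);
    out << 10;   // written as "0a": width 2, padded with the fill character
    out << 11;   // written as "b": the width is back to 0, the base is still hex

    return out.str();
}
```

The returned string is "0ab", not "0a0b", which is why the loop above calls width before every insertion.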

Now reading the values back in again requires that we read them initially as strings of length 2. This is because for formatted input width is _only_ used when reading strings.

   size_t const          MessageSize(40);

   size_t const          HexFieldsBeginIndex(0);
   size_t const          HexFieldsEndIndex(16); // STL style 1 past the end
   std::streamsize const HexFieldWidth(2);

   char const            DataFileName[] = "test.txt";

  /// ...

   std::ifstream rr(DataFileName);
   rr.setf(std::ios::hex, std::ios::basefield);
   Byte rmsg[MessageSize];
   
   if ( rr.is_open() )
   {
     std::cout << "Read strings: ";

     for ( size_t i=HexFieldsBeginIndex; rr && i != HexFieldsEndIndex; ++i )
     {
         rr.width(HexFieldWidth);
         std::string strHexValue;
         rr >> strHexValue;
         std::cout << strHexValue << ' ';
     }
     std::cout << std::endl;
     
   }
   else
   {
       std::cerr << "Failed to open data file " << DataFileName << '.' << std::endl;
       return 1; // from example code main
   }

Note that my examples presume they are in an example test program's main function or similar that returns an integer value to the caller.
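
The same width-limited extraction can be tried out on any input stream. The following is only a sketch: the function name readHexFields is my own invention, and the field width of 2 is hard wired to match the data written earlier.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: extracts up to fieldCount strings of two characters.
std::vector<std::string> readHexFields(std::istream & in, std::size_t fieldCount)
{
    std::vector<std::string> fields;
    for (std::size_t i = 0; i != fieldCount; ++i)
    {
        in.width(2);              // limits the next string extraction only
        std::string field;
        if (!(in >> field))
        {
            break;                // stop on end of data or on a stream error
        }
        fields.push_back(field);
    }
    return fields;
}
```

Given a std::istringstream holding "000a10ff" and a field count of 4, this should produce the strings "00", "0a", "10" and "ff". Be aware that operator>> skips leading white space, so any spaces or line breaks between fields are silently dropped.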

We could also use the likes of the get or getline input stream member functions with a count value of 2 - although we would have to read the data into a char array and not have the option of using a std::string:

     for ( size_t i=HexFieldsBeginIndex; rr && i != HexFieldsEndIndex; ++i )
     {
         size_t const HexCStringBufferSize(HexFieldWidth+1);
         char strHexValue[HexCStringBufferSize];
         rr.get(strHexValue, HexCStringBufferSize);
         std::cout << strHexValue << ' ';
     }

Notice that I have gone to the bother of defining names for most of the magic values such as 40, 16, 2 and test.txt. I am also using a message buffer called rmsg, which is defined ready to be filled in but is not used yet.
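
For comparison, here is a sketch of the get based approach wrapped up as a function. The name readHexFieldsWithGet is hypothetical. Unlike operator>>, get does not skip white space, and it stops at a newline without consuming it, so this version assumes the hex digits are written back to back on a single line.

```cpp
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads up to fieldCount two-character fields using get.
std::vector<std::string> readHexFieldsWithGet(std::istream & in, std::size_t fieldCount)
{
    std::streamsize const fieldWidth = 2;
    char buffer[fieldWidth + 1];  // room for the terminating '\0'

    std::vector<std::string> fields;
    for (std::size_t i = 0; i != fieldCount; ++i)
    {
        // get stores at most fieldWidth characters and then appends '\0'.
        if (!in.get(buffer, fieldWidth + 1))
        {
            break;                // nothing could be read
        }
        fields.push_back(buffer);
    }
    return fields;
}
```

Note that a trailing field may be shorter than two characters if the data has an odd number of digits; the sketch does not check for that.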

Once we have the data in string form - either a std::string or a C-style string we have to convert this value to an integer value. There are two obvious ways to do this:

   - use a std::istringstream using the read std::string data as an input source
      and extract the input into an integer

   - use the underlying locale num_get facet support for numeric conversions.

The third option is to do the conversion 'by hand' and write code to do it explicitly.

The first option would look like so:

   for ( size_t i=HexFieldsBeginIndex; rr && i != HexFieldsEndIndex; ++i )
   {
       rr.width(HexFieldWidth);
       std::string strHexValue;
       rr >> strHexValue;

       std::istringstream ss(strHexValue);
       ss.setf(std::ios::hex, std::ios::basefield);
       int intHexValue(0);
       ss >> intHexValue;
       rmsg[i] = static_cast<Byte>(intHexValue);
   }

Here I define a std::istringstream (include <sstream>) and initialise it with the previously read strHexValue std::string. I then set the basefield flag on this stream to std::ios::hex for hexadecimal formatted integer character streams and extract the field value into a temporary int to get around the possibility of unsigned char being interpreted as characters rather than integers. I then assign the int to the value of the relevant rmsg Byte field item. The static_cast to Byte is to suppress compiler warnings about losing data, as int is generally a larger type than the char types. Note that in this version setting the std::ios::hex flag on the std::ifstream is redundant.

If you were using a zero terminated C string char array buffer rather than a std::string then you would have to use the deprecated std::istrstream char * C-string stream (include <strstream>). Or you could convert the buffer to a std::string first (e.g. create a std::string initialised using the C string buffer).
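
As a sketch of the convert-to-std::string route, the conversion of a single field can be pulled out into a small function. The name hexFieldToByte, the range check and the typedef for Byte are my own assumptions; the std::istringstream usage is the same as in the loop above.

```cpp
#include <ios>
#include <sstream>
#include <string>

typedef unsigned char Byte; // assumed definition of the question's Byte type

// Hypothetical helper: converts a C string of hex digits to a Byte.
// Returns false if the text is not a hex number or does not fit in a Byte.
bool hexFieldToByte(char const * field, Byte & value)
{
    std::string const text(field);  // convert the C string buffer first
    std::istringstream ss(text);
    ss.setf(std::ios::hex, std::ios::basefield);

    int parsed = 0;                 // int, so the digits are not read as a character
    if (!(ss >> parsed) || parsed < 0 || parsed > 0xff)
    {
        return false;
    }
    value = static_cast<Byte>(parsed);
    return true;
}
```

For example hexFieldToByte("0a", b) should return true and set b to 10, whereas hexFieldToByte("zz", b) should return false and leave b alone.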

The second option would look as follows:

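   // A locale is only required to hold num_get facets that use stream buffer
   // iterators, so install one for std::string iterators into a copy of the
   // current global locale (the locale takes ownership of the new facet).
   typedef std::num_get<char, std::string::iterator> StringNumGet;
   std::locale const loc( std::locale(), new StringNumGet );
   StringNumGet const & numGetFacet( std::use_facet<StringNumGet>(loc) );

   for ( size_t i=HexFieldsBeginIndex; rr && i != HexFieldsEndIndex; ++i )
   {
       rr.width(HexFieldWidth);
       std::string strHexValue;
       rr >> strHexValue;

       unsigned short ushortHexValue(0);
       std::ios_base::iostate err(std::ios_base::goodbit);
       numGetFacet.get(strHexValue.begin(), strHexValue.end(), rr, err, ushortHexValue );

       if ( (err&std::ios_base::failbit) || (err&std::ios_base::badbit) )
       {
         std::cerr << "Bad data in data file " << DataFileName << '.' << std::endl;
         return 2; // from example code main
       }

       rmsg[i] = static_cast<Byte>(ushortHexValue);
   }

First I install a std::num_get facet for std::string iterators into a copy of the current locale and obtain a reference to it. A locale is only required to contain the num_get facets that work with stream buffer iterators, so asking std::locale() directly for this one can fail with a std::bad_cast exception on some implementations. The locale had better match the locale in force at the time the data was written - if not then use a specific locale matching that used to write the data. Note that unless the code actually makes an effort to set locales and facets then the default locale will always be the C locale, with the default C locale facets.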

As you can see I need to ask for a std::num_get facet (include <locale>) that matches both the character type in use and the type of the iterators we are going to use to provide the sequence of characters to convert to a numeric value. If you were using a C string buffer rather than a std::string then you could specify char * as the iterator type.

We have to call a get member function on the (reference to) this facet. I chose the overload taking an unsigned short - as there is such an overload of get for this type but not one for int (it is handled by that for long).

As you can see we pass in a sequence of characters specified by start and end iterator, a stream from which to take formatting flags (which is why I set the std::ios::hex flag on the rr stream), a reference to an object to pass back IOStream-style error flags and a reference to an object to accept the converted value.

If you were using a C string buffer the call to the get member function would look something like:

     numGetFacet.get(strHexValue, strHexValue+HexCStringBufferSize, rr, err, ushortHexValue );
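
Here is a self-contained sketch of the C string variant. The function name cStringHexToUShort is hypothetical, and I have assumed char const * as the iterator type so that string literals can be passed in. As a std::locale is only required to hold num_get facets for stream buffer iterators, the sketch installs the facet it needs into a copy of the classic locale before asking for it.

```cpp
#include <cstring>
#include <ios>
#include <locale>
#include <sstream>

// Hypothetical helper: converts a C string of hex digits using num_get.
bool cStringHexToUShort(char const * text, unsigned short & value)
{
    typedef std::num_get<char, char const *> CStringNumGet;

    // Install the facet into a copy of the classic locale; the locale owns it.
    std::locale const loc(std::locale::classic(), new CStringNumGet);
    CStringNumGet const & facet = std::use_facet<CStringNumGet>(loc);

    // This stream is never read from; it only supplies the formatting flags.
    std::istringstream formatSource;
    formatSource.setf(std::ios::hex, std::ios::basefield);

    std::ios_base::iostate err = std::ios_base::goodbit;
    unsigned short parsed = 0;
    facet.get(text, text + std::strlen(text), formatSource, err, parsed);

    if (err & (std::ios_base::failbit | std::ios_base::badbit))
    {
        return false;
    }
    value = parsed;
    return true;
}
```

Building a locale on every call is wasteful; real code would create the locale once and keep it. Here the end iterator is computed with std::strlen so that the terminating zero is not part of the sequence.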

After returning from the num_get get call I check to see if any errors occurred - specifically whether the fail bit or bad bit is set - and return if either is set.

If all is OK I set the relevant Byte in rmsg similarly to that for the std::istringstream case.
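
The third, 'by hand', option mentioned earlier was not shown. One possible sketch follows; the function names hexDigitValue and hexPairToByte are my own, and the code assumes that the letters a to f (and A to F) have consecutive character codes, as they do in ASCII.

```cpp
#include <string>

typedef unsigned char Byte; // assumed definition of the question's Byte type

// Hypothetical helper: returns 0..15 for a hex digit, or -1 for anything else.
int hexDigitValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: converts exactly two hex digits to a Byte.
bool hexPairToByte(std::string const & field, Byte & value)
{
    if (field.size() != 2)
    {
        return false;
    }
    int const high = hexDigitValue(field[0]);
    int const low = hexDigitValue(field[1]);
    if (high < 0 || low < 0)
    {
        return false;
    }
    value = static_cast<Byte>(high * 16 + low);
    return true;
}
```

This avoids streams and locales for the conversion step altogether, at the price of having to write and test the digit handling yourself.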

For more information on the C++ standard library I suggest you obtain _and use_ a good reference such as "The C++ Standard Library A Tutorial and Reference" by Nicolai M. Josuttis or - specifically for IOStreams and Locales - "Standard C++ IOStreams and Locales" by Langer and Kreft.

Hope this is of use.

Ralph McArdell


©2012 About.com, a part of The New York Times Company. All rights reserved.