C++/Send Struct from client to server

Question
Hi,
I am working on a project in C++ on Linux. I want to send a struct which consists of 2 matrices and some variables. So I have combined them in a struct and am using write to send it to the client. On the client side I am using read to receive the matrix, but I am unable to get proper data as I am just getting garbage values. Can you please help me in solving the problem?

Answer
I cannot give any specific advice as your question lacks details, but I can point out some of the reasons this type of technique either fails or, if it works, is extremely fragile and likely to break or be tricky to maintain in the future.

[note: although struct implies you may be using C, as you are asking a C++ expert I may slip into using some C++ style in any example code snippets]

There are several potential problems with this approach:

Firstly, if the struct contains pointers (or for that matter C++ references) to data then just writing the contents of an instance of the struct will write the values of the pointers - which will be meaningless to the client - even if the client is running in a separate process on the same machine:

   struct DataMessage
   {
       Matrix * matrix_1_pointer;
       Matrix * matrix_2_pointer;
       int      var1;
       int      var2;
   };

Using a compiler targeting a 32-bit, 8-bit byte addressed machine in which pointers and ints are 32-bits in size the above struct will most likely have a size of 16 (each member is 4 bytes in size). In memory it might look as follows (with a supposed memory address on the left followed by the values at that and any immediately following addresses, with values in hexadecimal):

   0x00c000a0 : 0x00b0af80     // matrix_1_pointer
   0x00c000a4 : 0x00b08400     // matrix_2_pointer
   0x00c000a8 : 0x00000123     // var1
   0x00c000ac : 0x00000004     // var2

Just blatting the above 16 bytes over to the client sends them pointers into the server process' memory space and misses the data contents of the matrices.

Note that pointers may be hidden inside other objects - such as C++ std::string objects for example, or maybe within objects of some matrix class (assume objects of the Matrix type used in my examples do not contain pointers or references).
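As a rough illustration of the point above, the sketch below checks at compile time what the raw bytes of such a struct can and cannot carry. The 3x3 Matrix of doubles and the NamedMessage struct are assumptions made up for this example; they are not taken from the question.

```cpp
#include <cassert>       // assert, for checking the sketch
#include <string>
#include <type_traits>

// Hypothetical fixed-size matrix, assumed purely for illustration.
struct Matrix
{
    double values[3][3];
};

// Holds only the addresses of the matrices, not their contents.
struct PointerMessage
{
    Matrix * matrix_1_pointer;
    Matrix * matrix_2_pointer;
    int      var1;
    int      var2;
};

// Hypothetical struct whose std::string member hides a pointer to its characters.
struct NamedMessage
{
    std::string name;
    int         var1;
};

// The struct's own bytes are far too few to contain the two matrices.
constexpr bool raw_bytes_include_matrices =
    sizeof(PointerMessage) >= 2 * sizeof(Matrix);

// A type that is not trivially copyable must never be sent as raw bytes.
constexpr bool named_message_is_raw_copyable =
    std::is_trivially_copyable<NamedMessage>::value;
```

Note that PointerMessage itself is trivially copyable, so that trait is only a necessary check and not a sufficient one: the copied pointer values are still useless in another process.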

A second problem is that compilers are allowed to add extra padding between struct members to ensure good - or even required - alignment of data on memory boundaries. For example if we re-wrote DataMessage with the var1 and var2 members as single byte chars and interleaved with the matrix pointers:

   struct DataMessage
   {
       char     var1;
       Matrix * matrix_1_pointer;
       char     var2;
       Matrix * matrix_2_pointer;
   };

Then it is quite likely that the size of the struct when compiled for our 32-bit, 8-bit byte addressed machine is still 16, as the pointers are likely to require or prefer for performance reasons to be aligned on a 4-byte boundary. In memory an instance might look like so:

   0x00c000a0 : 0x23          // var1
   0x00c000a1 : 0xdeadbf       // 3 bytes of padding containing 'random' rubbish values
   0x00c000a4 : 0x00b0af80     // matrix_1_pointer
   0x00c000a8 : 0x04          // var2
   0x00c000a9 : 0xddbeef       // 3 bytes of padding containing 'random' rubbish values
   0x00c000ac : 0x00b08400     // matrix_2_pointer

Worse still, such padding differs depending on the compiler, the target platform and maybe even what compilation options are in effect. This would be a problem if your client expected different struct padding to that used by the sending server. In addition, sending the padding needlessly increases the volume of data sent. One final problem is that the 'random' rubbish values in the padding space are likely to be whatever was in the memory previously and this _might_ be security sensitive - so sending such data could pose a potential security risk.
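If you want to see what padding your own compiler adds, a small sketch such as the following can help. It uses only sizeof, alignof and the standard offsetof macro; the names member_bytes and padding_bytes are made up for this example.

```cpp
#include <cassert>   // assert, for checking the sketch
#include <cstddef>   // std::size_t, offsetof

struct Matrix;  // only pointers to Matrix are used, so a declaration is enough

struct DataMessage
{
    char     var1;
    Matrix * matrix_1_pointer;
    char     var2;
    Matrix * matrix_2_pointer;
};

// Bytes occupied by the members themselves.
constexpr std::size_t member_bytes = 2 * sizeof(char) + 2 * sizeof(Matrix *);

// Bytes the compiler added as padding in this particular build.
constexpr std::size_t padding_bytes = sizeof(DataMessage) - member_bytes;

// Offset of each member from the start of the struct in this build.
constexpr std::size_t offset_of_matrix_1 = offsetof(DataMessage, matrix_1_pointer);
constexpr std::size_t offset_of_var2     = offsetof(DataMessage, var2);
```

On a typical 64-bit Linux build the padding is likely to be larger than in the 32-bit layout shown above, because pointers are then usually aligned on 8-byte boundaries. Printing these values on both the client and the server build is a quick way to spot a mismatch.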

The next problem is that of byte-ordering of values larger than a byte. As soon as you have two or more items there is more than one way to arrange them, and this applies to the bytes of any value larger than a single byte, that is, any value that spans more than one memory address. This ordering is termed the endianness of the processor (see http://en.wikipedia.org/wiki/Endianness for lots of details). Most modern machines address data in chunks of 8-bit bytes or multiples thereof. So assuming we have two byte-addressed 32-bit machines, the 4 bytes that make up a 32-bit int can be arranged in memory in various ways. The two most common are:

   big-endian - the byte at the address of the int is the most significant byte of the 4-byte int value,
   little-endian - the byte at the address of the int is the least significant byte of the 4-byte int value.

In addition, IP networking has the concept of a network byte order, which is big-endian - see for example htonl and ntohl (http://linux.die.net/man/3/htons has more information).

Obviously trying to interpret data of one endianness assuming some other endianness will yield rubbish.
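To make the byte-ordering point concrete, here is a minimal sketch that inspects the bytes of a 32-bit value as they sit in memory. The helper names host_is_little_endian and bytes_of are made up for this example; htonl and ntohl are the POSIX functions mentioned above.

```cpp
#include <arpa/inet.h>  // htonl, ntohl (POSIX)
#include <cassert>      // assert, for checking the sketch
#include <cstdint>
#include <cstring>

// Returns true if this host stores the least significant byte first.
inline bool host_is_little_endian()
{
    const std::uint32_t probe = 1;
    unsigned char first_byte;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

// Copies the in-memory bytes of a 32-bit value into out[0..3].
inline void bytes_of(std::uint32_t value, unsigned char out[4])
{
    std::memcpy(out, &value, 4);
}
```

Whatever the host's endianness, the bytes of htonl(0x01020304) sit in memory as 01 02 03 04, which is what makes the result safe to copy into a message buffer; ntohl reverses the conversion on the receiving side.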

Another obvious potential problem would be differing data formats for built in types - such as the sizes of built in types, signed integer representation and floating point representation. This is a very real possible problem if your client and server code are built using different compilers or platform targets. For example MSVC++ for x86 uses a 64-bit long double type whereas GNU for x86 uses an 80-bit format. 32-bit x86 and other 32-bit compilers often have 16-bit short ints and 32-bit ints and long ints, however x86-64 and other 64-bit targets might have 32-bit ints and 64-bit long ints, or 64-bit ints and long ints, or 32-bit ints and long ints (and save 64-bit integer formats for long long ints). Signed integer formats may also differ - although two's complement is by far and away the most common for modern processors. Use of IEEE floating point formats is now common but other proprietary formats may still be around.

One way around the size problem for integers is to use type aliases from the C stdint.h header (and more recently the C++ cstdint header) such as int32_t (see for example http://pubs.opengroup.org/onlinepubs/007904975/basedefs/stdint.h.html).
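A sketch of how these assumptions might be written down so that the compiler checks them on each platform you build for. The Counts struct is a made-up example; the checks use only standard facilities (static_assert, CHAR_BIT, std::numeric_limits).

```cpp
#include <cassert>   // assert, for checking the sketch
#include <climits>   // CHAR_BIT
#include <cstdint>
#include <limits>

// Compile-time checks of the assumptions a wire format might rely on.
static_assert(CHAR_BIT == 8, "bytes are assumed to be 8 bits wide");
static_assert(sizeof(std::int32_t) == 4, "int32_t is assumed to occupy four bytes");
static_assert(std::numeric_limits<double>::is_iec559,
              "double is assumed to use an IEEE 754 format");

// Hypothetical message fields declared with fixed-width types,
// so their sizes do not depend on what int or long happen to be.
struct Counts
{
    std::int32_t  var1;
    std::uint16_t var2;
};
```

If one of the assumptions does not hold for some target, the build fails with the given message instead of the program quietly exchanging rubbish at run time.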

Other problems include structure versioning (e.g. if the client and server are at different versions and some of the data structure formats differ between versions) and data members that do not need to be transferred.

So what should you do?

Firstly, if not done already, flatten your data - i.e. make all externally referenced data internal to the struct or class used for communications - so instead of:

   struct DataMessage
   {
       Matrix * matrix_1_pointer;
       Matrix * matrix_2_pointer;
       int      var1;
       int      var2;
   };

use:

   struct DataMessage
   {
       Matrix matrix_1;
       Matrix matrix_2;
       int      var1;
       int      var2;
   };

Note this only needs to be done for message transport purposes - you can keep two sets of structs and / or classes - those for in-memory processing and flat representations for client<->server communication.

Be very careful about alignment and padding. If it seems to be a problem try copying individual values into and out of a byte-buffer and just send the raw bytes. In fact doing this would also flatten the data!

Always send data in network byte order and translate it to host endianness on receipt.
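The two pieces of advice above (copy individual values through a byte buffer, and use network byte order) might be combined in a pair of small helpers along these lines. The names put_u32 and get_u32 are made up for this sketch.

```cpp
#include <arpa/inet.h>  // htonl, ntohl (POSIX)
#include <cassert>      // assert, for checking the sketch
#include <cstddef>
#include <cstdint>
#include <cstring>

// Writes value to buf in network byte order.
// Returns the number of bytes written, or 0 if there is too little room.
inline std::size_t put_u32(std::uint32_t value, unsigned char * buf, std::size_t length)
{
    if (length < sizeof(std::uint32_t)) return 0;
    const std::uint32_t netval = htonl(value);
    std::memcpy(buf, &netval, sizeof netval);
    return sizeof netval;
}

// Reads a network byte order value from buf into value (host byte order).
// Returns the number of bytes read, or 0 if the buffer is too short.
inline std::size_t get_u32(unsigned char const * buf, std::size_t length, std::uint32_t & value)
{
    if (length < sizeof(std::uint32_t)) return 0;
    std::uint32_t netval;
    std::memcpy(&netval, buf, sizeof netval);
    value = ntohl(netval);
    return sizeof netval;
}
```

Using memcpy rather than casting the buffer pointer to uint32_t* avoids any alignment requirement on the buffer, and because only the four value bytes are copied no struct padding ever reaches the wire.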

Putting all these points together, write two functions for each data-structure or data-structure network you wish to communicate: one that takes such a structure and produces a byte message buffer (array of unsigned char), and an inverse function that takes a buffer of bytes and produces an in-memory structure. In your struct definitions try to use machine-neutral typedefs for integers such as int32_t from cstdint / stdint.h.

So for:

   struct DataMessage
   {
       Matrix * matrix_1_pointer;
       Matrix * matrix_2_pointer;
       int      var1;
       int      var2;
   };

We might start by using int32_t instead of int:

   struct DataMessage
   {
       Matrix * matrix_1_pointer;
       Matrix * matrix_2_pointer;
       int32_t  var1;
       int32_t  var2;
   };

Then we might write four functions:

   size_t matrix_to_msg_buf( Matrix const & m, unsigned char * buf, size_t length );
   size_t msg_buf_to_matrix( unsigned char const * buf, size_t length, Matrix & m );

   size_t datamessage_to_msg_buf( DataMessage const & m, unsigned char * buf, size_t length );
   size_t msg_buf_to_datamessage( unsigned char const * buf, size_t length, DataMessage & m );

In each case the size_t return value is 0 if there was an error (e.g. too little room in the buffer); otherwise it is the number of bytes written to or read from the buffer.
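As I do not know what your matrices look like, the following is only a sketch of matrix_to_msg_buf under a loud assumption: Matrix here is a made-up fixed-size 2x2 matrix of int32_t values, written out row by row in network byte order.

```cpp
#include <arpa/inet.h>  // htonl (POSIX)
#include <cassert>      // assert, for checking the sketch
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical matrix type, assumed only for this sketch.
struct Matrix
{
    std::int32_t values[2][2];
};

// Writes the matrix cells to buf row by row, each in network byte order.
// Returns the number of bytes written, or 0 if the buffer is too small.
std::size_t matrix_to_msg_buf(Matrix const & m, unsigned char * buf, std::size_t length)
{
    const std::size_t needed = 2 * 2 * sizeof(std::uint32_t);
    if (length < needed) return 0;

    unsigned char * pos = buf;
    for (auto const & row : m.values)
    {
        for (std::int32_t cell : row)
        {
            // Convert via the unsigned type so htonl sees a plain bit pattern.
            const std::uint32_t netval = htonl(static_cast<std::uint32_t>(cell));
            std::memcpy(pos, &netval, sizeof netval);
            pos += sizeof netval;
        }
    }
    return needed;
}
```

A matrix whose dimensions vary at run time would also need to write its row and column counts ahead of the cells, and floating point cells would need an agreed format; both are left out here to keep the idea visible.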

To give an idea of how such functions' implementations would look, below is an initial quick attempt at the implementation of datamessage_to_msg_buf:

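   #include <arpa/inet.h>   // htonl
   #include <cstdint>       // uint32_t
   #include <cstring>       // memcpy

   // Writes m to buf in network byte order: var1, var2, matrix 1, matrix 2.
   // Returns the number of bytes written, or 0 on error.
   size_t datamessage_to_msg_buf( DataMessage const & m, unsigned char * buf, size_t length )
   {
       unsigned char * buf_end( buf + length );
       unsigned char * buf_pos( buf );

       // Both integers must fit; fail rather than silently skip a field.
       if ( static_cast<size_t>(buf_end - buf_pos) < 2 * sizeof(uint32_t) )
       {
         return 0;
       }

       // var1, converted to network byte order
       uint32_t netval = htonl(static_cast<uint32_t>(m.var1));
       memcpy( buf_pos, &netval, sizeof(uint32_t) );
       buf_pos += sizeof(uint32_t);

       // var2, converted to network byte order
       netval = htonl(static_cast<uint32_t>(m.var2));
       memcpy( buf_pos, &netval, sizeof(uint32_t) );
       buf_pos += sizeof(uint32_t);

       // First matrix; matrix_to_msg_buf returns 0 on error
       size_t len = matrix_to_msg_buf(*m.matrix_1_pointer, buf_pos, buf_end - buf_pos);
       if ( 0 == len )
       {
         return 0;
       }
       buf_pos += len;

       // Second matrix follows directly after the first
       len = matrix_to_msg_buf(*m.matrix_2_pointer, buf_pos, buf_end - buf_pos);
       if ( 0 == len )
       {
         return 0;
       }
       buf_pos += len;

       return static_cast<size_t>(buf_pos - buf);
   }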

The above is untested (and therefore may contain some bugs), relies on calling matrix_to_msg_buf to fill out the data for the referenced matrices, and is only intended to give a rough idea of the sort of steps required. The functions do use C++ reference types for parameters but I have stayed away from other C++ facilities such as exceptions in the hope that the ideas will translate more easily to C if necessary.
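The receiving side mirrors the sending side field by field. Below is a sketch of msg_buf_to_datamessage and msg_buf_to_matrix under some loud assumptions: Matrix is a made-up 2x2 matrix of int32_t values, the fields arrive in the order var1, var2, matrix 1, matrix 2, the caller has already pointed the two matrix pointers at valid Matrix objects, and get_i32 is a helper invented for this example.

```cpp
#include <arpa/inet.h>  // ntohl (POSIX)
#include <cassert>      // assert, for checking the sketch
#include <cstddef>
#include <cstdint>
#include <cstring>

struct Matrix { std::int32_t values[2][2]; };  // hypothetical, for this sketch only

struct DataMessage
{
    Matrix *     matrix_1_pointer;
    Matrix *     matrix_2_pointer;
    std::int32_t var1;
    std::int32_t var2;
};

// Reads one 32-bit network byte order value and advances pos.
// Returns false if fewer than four bytes remain.
bool get_i32(unsigned char const *& pos, unsigned char const * end, std::int32_t & out)
{
    if (static_cast<std::size_t>(end - pos) < sizeof(std::uint32_t)) return false;
    std::uint32_t netval;
    std::memcpy(&netval, pos, sizeof netval);
    out = static_cast<std::int32_t>(ntohl(netval));
    pos += sizeof netval;
    return true;
}

// Returns the number of bytes read from buf, or 0 on error.
std::size_t msg_buf_to_matrix(unsigned char const * buf, std::size_t length, Matrix & m)
{
    unsigned char const * pos = buf;
    unsigned char const * const end = buf + length;
    for (auto & row : m.values)
        for (std::int32_t & cell : row)
            if (!get_i32(pos, end, cell)) return 0;
    return static_cast<std::size_t>(pos - buf);
}

// Returns the number of bytes read from buf, or 0 on error.
// On error m and its matrices may have been partly overwritten.
std::size_t msg_buf_to_datamessage(unsigned char const * buf, std::size_t length, DataMessage & m)
{
    if (!m.matrix_1_pointer || !m.matrix_2_pointer) return 0;
    unsigned char const * pos = buf;
    unsigned char const * const end = buf + length;

    if (!get_i32(pos, end, m.var1) || !get_i32(pos, end, m.var2)) return 0;

    std::size_t len = msg_buf_to_matrix(pos, static_cast<std::size_t>(end - pos), *m.matrix_1_pointer);
    if (len == 0) return 0;
    pos += len;

    len = msg_buf_to_matrix(pos, static_cast<std::size_t>(end - pos), *m.matrix_2_pointer);
    if (len == 0) return 0;
    pos += len;

    return static_cast<std::size_t>(pos - buf);
}
```

The important design point is that the two directions agree exactly on field order, field width and byte order; any mismatch there gives the sort of garbage values described in the question.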

Another approach is to serialise the in-memory data and de-serialise the received messages (as such processes are called) by converting all data to a text (string) format using C++ (operator>> and operator<<, std::istringstream and std::ostringstream, etc.) or C formatted IO (sprintf, sscanf etc.) and the like to build a text message buffer, send the text message and then parse the received text to create equivalent in-memory data structures. This approach is best done using byte encoded character data (e.g. ASCII, UTF-8) rather than wide character types (e.g. UTF-16, UTF-32). It is also useful in the case where the binary formats for data types (e.g. floating point formats) differ between machines.
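A minimal sketch of the text approach using the string streams mentioned above. The 2x2 Matrix of doubles and the function names matrix_to_text and text_to_matrix are assumptions made up for this example, and the default "C" locale is assumed on both sides.

```cpp
#include <cassert>   // assert, for checking the sketch
#include <limits>
#include <sstream>
#include <string>

// Hypothetical matrix type, assumed only for this sketch.
struct Matrix
{
    double values[2][2];
};

// Writes the cells as space separated decimal text, row by row.
std::string matrix_to_text(Matrix const & m)
{
    std::ostringstream out;
    // Enough significant digits for a double to survive the round trip.
    out.precision(std::numeric_limits<double>::max_digits10);
    for (auto const & row : m.values)
        for (double cell : row)
            out << cell << ' ';
    return out.str();
}

// Parses text produced by matrix_to_text. Returns false if a cell is
// missing or is not a number; m may then be partly overwritten.
bool text_to_matrix(std::string const & text, Matrix & m)
{
    std::istringstream in(text);
    for (auto & row : m.values)
        for (double & cell : row)
            if (!(in >> cell)) return false;
    return true;
}
```

The resulting string can be sent with write like any other byte buffer, although the receiver then needs some way to know where a message ends, for example a length prefix or a terminating newline.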

In C++ you could use classes and member functions - maybe define an abstract base class that declares pure virtual member functions to be implemented by serialisable data types. This would be only one way to approach the problem.
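One possible shape for such a base class is sketched below. The names Serialisable, to_msg_buf, from_msg_buf and the Counter example type are all made up for illustration, and the return value convention follows the functions declared earlier (0 on error).

```cpp
#include <arpa/inet.h>  // htonl, ntohl (POSIX)
#include <cassert>      // assert, for checking the sketch
#include <cstddef>
#include <cstdint>
#include <cstring>

// Interface to be implemented by every type that can be sent in a message.
class Serialisable
{
public:
    virtual ~Serialisable() {}
    // Both return the number of bytes used, or 0 on error.
    virtual std::size_t to_msg_buf(unsigned char * buf, std::size_t length) const = 0;
    virtual std::size_t from_msg_buf(unsigned char const * buf, std::size_t length) = 0;
};

// Hypothetical serialisable type holding a single 32-bit value.
class Counter : public Serialisable
{
public:
    explicit Counter(std::uint32_t value = 0) : value_(value) {}
    std::uint32_t value() const { return value_; }

    std::size_t to_msg_buf(unsigned char * buf, std::size_t length) const override
    {
        if (length < sizeof value_) return 0;
        const std::uint32_t netval = htonl(value_);
        std::memcpy(buf, &netval, sizeof netval);
        return sizeof netval;
    }

    std::size_t from_msg_buf(unsigned char const * buf, std::size_t length) override
    {
        if (length < sizeof value_) return 0;
        std::uint32_t netval;
        std::memcpy(&netval, buf, sizeof netval);
        value_ = ntohl(netval);
        return sizeof netval;
    }

private:
    std::uint32_t value_;
};
```

Code that sends messages can then work purely in terms of Serialisable references, and a composite type can implement its own functions by calling those of its members in turn.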

Hope this gives you some hints.

Ralph McArdell

Expertise

I am a software developer with more than 15 years C++ experience and over 25 years experience developing a wide variety of applications for Windows NT/2000/XP, UNIX, Linux and other platforms. I can help with basic to advanced C++, C (although I do not write just-C much if at all these days so maybe ask in the C section about purely C matters), software development and many platform specific and system development problems.

Experience

My career started in the mid 1980s working as a batch process operator for the now defunct Inner London Education Authority, working on Prime mini computers. I then moved into the role of Programmer / Analyst, also on the Primes, then into technical support and finally into the micro computing section, using a variety of 16 and 8 bit machines. Following the demise of the ILEA I worked for a small company, now gone, called Hodos. I worked on a part task train simulator using C and the Intel DVI (Digital Video Interactive) - the hardware based predecessor to Indeo. Other projects included a CGI based train simulator (different goals to the first), and various other projects in C and Visual Basic (er, version 1 that is). When Hodos went into receivership I went freelance and finally managed to start working in C++. I initially had contracts working on train simulators (surprise) and multimedia - I worked on many of the Dorling Kindersley CD-ROM titles and wrote the screensaver games for the Wallace and Gromit Cracking Animator CD. My more recent contracts have been more traditionally IT based, working predominately in C++ on MS Windows NT, 2000, XP, Linux and UN*X. These projects have had wide ranging additional skill sets including system analysis and design, databases and SQL in various guises, C#, client server and remoting, cross porting applications between platforms and various client development processes. I have an interest in the development of the C++ core language and libraries and try to keep up with at least some of the papers on the ISO C++ Standard Committee site at http://www.open-std.org/jtc1/sc22/wg21/.

©2016 About.com. All rights reserved.