C++/Strings
Expert: Ralph McArdell - 6/26/2006
Question: I understand how to write a single word into a string variable. I also understand how to write a sentence into a string variable if it is one line. I would like to know how to write a sentence into a string if the sentence is two or more lines long.
Answer: You concatenate each line to the end of the string. You will probably also have to append a newline ('\n') before each line after the first, or if you do not wish to preserve the line breaks then a space will do instead. To do this you use two strings - one holds the characters of the current line and the other holds the whole sentence being built from the lines.
The pseudo code looks like this:
For each line:
    Read line into line_string
    If sentence_string is not empty:
        Append newline to sentence_string
    End If
    Append line_string to sentence_string
End For Each
When translating into C++ you will find the 'for each line' logic a little more complex - it is easier to use a bottom-tested do...while loop. Further, the easiest string type to use for the sentence is a C++ std::string, but the member function istream::getline reads into a C-style char array. The following is an example program to read many lines and build a sentence string from them:
#include <cstdlib>  // C library declarations for std::system
#include <cstring>  // C string declarations for std::strlen
#include <iostream> // C++ iostreams for std::cin and std::cout
#include <string>   // C++ string classes for std::string

int main()
{
    // line_string is a C-style char array for std::istream::getline
    std::size_t const MaxLineLength(1024);
    char line_string[ MaxLineLength ];
    // sentence_string can be a std::string however.
    std::string sentence_string;
    // The line length is used to detect when we are done
    std::size_t line_length( 0 );
    // Prompt user
    std::cout << "Please enter text. "
                 "It can use many lines. "
                 "Enter a blank line to stop:\n"
              ;
    // Start line reading, sentence building loop
    do
    {
        // Read line and obtain line length
        std::cin.getline( line_string, MaxLineLength );
        line_length = std::strlen( line_string );
        // Don't process line_string if it is empty
        if ( line_length > 0 )
        {
            // If we already have text in the sentence append a newline
            if ( ! sentence_string.empty() )
            {
                sentence_string += '\n';
            }
            // Append line_string text to sentence_string
            sentence_string += line_string;
        }
    } // Continue while line_string not empty
    while ( line_length > 0 );
    // Display final sentence_string value:
    std::cout << "Sentence:\n" << sentence_string << '\n';
    // Windows-specific: keeps the console window open until a key is pressed
    std::system( "pause" );
}
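As a hedged aside, the standard library also provides a free function std::getline (declared in <string>) that reads a line directly into a std::string, so no fixed-size char array or strlen call is needed. The sketch below applies the same pseudo code with a top-tested loop. The function name read_sentence and its stream parameter are illustrative assumptions and not part of the program above.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads lines from `in` until a blank line or the end
// of input, joining the lines with '\n'.
std::string read_sentence( std::istream & in )
{
    std::string sentence_string;
    std::string line_string;

    // std::getline returns the stream, which tests false once a read fails
    while ( std::getline( in, line_string ) && ! line_string.empty() )
    {
        // If we already have text in the sentence append a newline
        if ( ! sentence_string.empty() )
        {
            sentence_string += '\n';
        }
        sentence_string += line_string;
    }
    return sentence_string;
}
```

Calling read_sentence( std::cin ) would read from the keyboard in the same way as the program above; passing a std::istringstream makes it easy to try the function without typing.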
If sentence_string were also a C-style char array then you would use
strcat( sentence_string, line_string );
However strcat does _not_ allocate more space for the sentence_string array if it is needed, as std::string will. Instead, if there is not enough room, strcat will simply carry on writing past the end of the array.
If you are lucky this will cause the program to crash immediately. If you are not, it will either cause a crash later or cause other values to be 'mysteriously' changed because the memory they occupied was overwritten. This sort of thing is called a 'buffer overrun' and is a major source of security problems in software.
The result is that you have to take a decision as to what to do if the remaining space in sentence_string is less than the number of characters in line_string. Some options are:
1/ Make sentence_string very large (say char sentence_string[50000]), and ignore the problem of writing off the end of sentence_string if there is insufficient space.
2/ Make sentence_string large and check that there is sufficient space for strcat to operate safely.
3/ Make sentence_string initially large enough for most sentences and allocate more space as needed, copying the data from the old sentence_string memory to the new using strcpy. Note that strcpy has similar potential for buffer overruns as strcat.
The first option tries to prevent the problem by using a sentence array size so large that no sentence should ever be that long. It will eventually lead to a crash or other problems as people start using your sentence for other purposes - such as whole essays.
The second option is safe but may prevent some people's sentences being accepted because they are too long - hence the size has to be large enough for the longest conceivable sentence.
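A minimal sketch of the second option, assuming a fixed-size sentence buffer whose total size is known to the caller. The helper name checked_append and its bool result are hypothetical; the point is only that the length check happens before strcat is called.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends `source` to `destination` only if the result,
// including the terminating '\0', fits in `destination_size` chars.
// Returns false and leaves `destination` unchanged when it does not fit.
bool checked_append
( char * destination
, std::size_t destination_size
, char const * source
)
{
    std::size_t const used( std::strlen( destination ) );
    std::size_t const needed( std::strlen( source ) );

    // The first test guards the subtraction in the second against wrap-around.
    // One char of the remaining space is reserved for the terminator.
    if ( used >= destination_size || needed >= destination_size - used )
    {
        return false;
    }
    std::strcat( destination, source );
    return true;
}
```

The caller then decides what to do with a line that does not fit, for example telling the user that the sentence is too long.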
The final option allows the space used by sentence_string to be small at first and grow as needed. This is effectively what std::string does for you. However you have to handle the memory management yourself and this is not pretty and is error prone. Here is one implementation that I show as an example of the sort of thing that has to be done:
void Append
( char *& appended_to_string
, char const * const appended_string  // const so a literal such as "\n" can be passed
, size_t & append_to_max_size
)
{
    size_t append_to_length( strlen(appended_to_string) );
    size_t appended_length( strlen(appended_string) );
    // The buffer must hold both strings plus the terminating '\0'
    if ( (append_to_length + appended_length) >= append_to_max_size )
    {
        // Room for both strings and the terminator, plus 50% for growth
        append_to_max_size = (append_to_length + appended_length + 1);
        append_to_max_size += append_to_max_size/2;
        char * new_append_to_buffer = new char[append_to_max_size];
        strcpy( new_append_to_buffer, appended_to_string );
        delete [] appended_to_string;
        appended_to_string = new_append_to_buffer;
    }
    strcat( appended_to_string, appended_string );
}
This code has to be used with data that is correctly formed to start with. The appended_to_string has to be initially dynamically created using new [] (and destroyed using delete []). Its first character must be set to zero ( '\0' ) to ensure the C-string is initialised to zero length correctly. Both the appended_to_string and append_to_max_size may be modified in the Append function so must be output parameters as well as input parameters, and the passed items cannot be const (as you might use for the sentence_string maximum size value).
Calls to Append would have to be made anywhere I use += to append to sentence_string in the original example (and '\n' has to be passed as a string literal: "\n").
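To illustrate the set-up rules described above, here is a hedged sketch of a caller. Append is repeated, with a const source parameter and room for the terminating '\0', so that the block compiles on its own. The helper build_sentence, its parameters and the initial size of 16 are illustrative assumptions, and the result is copied into a std::string only so that the caller need not free the buffer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

void Append
( char *& appended_to_string
, char const * const appended_string
, std::size_t & append_to_max_size
)
{
    std::size_t append_to_length( std::strlen(appended_to_string) );
    std::size_t appended_length( std::strlen(appended_string) );
    if ( (append_to_length + appended_length) >= append_to_max_size )
    {
        // Both strings plus the terminator, plus 50% for growth
        append_to_max_size = (append_to_length + appended_length + 1);
        append_to_max_size += append_to_max_size/2;
        char * new_append_to_buffer = new char[append_to_max_size];
        std::strcpy( new_append_to_buffer, appended_to_string );
        delete [] appended_to_string;
        appended_to_string = new_append_to_buffer;
    }
    std::strcat( appended_to_string, appended_string );
}

// Hypothetical caller: joins `line_count` C-strings with newlines.
std::string build_sentence( char const * const * lines, std::size_t line_count )
{
    std::size_t sentence_max_size( 16 );  // not const: Append may change it
    char * sentence_string = new char[sentence_max_size];  // created with new []
    sentence_string[0] = '\0';            // start as a zero-length C-string

    for ( std::size_t i = 0; i < line_count; ++i )
    {
        // If we already have text in the sentence append a newline
        if ( sentence_string[0] != '\0' )
        {
            Append( sentence_string, "\n", sentence_max_size );
        }
        Append( sentence_string, lines[i], sentence_max_size );
    }

    std::string result( sentence_string );  // copy out before freeing
    delete [] sentence_string;              // destroyed with delete []
    return result;
}
```

Like the text it follows, this sketch leaves out error handling, so the buffer would leak if an allocation failed part way through.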
Finally, memory is a finite resource and so may become exhausted any time more is requested. This applies to allocations done by std::string, to those in Append and to those made elsewhere. I have not shown error handling for memory exhaustion to keep the code simpler.
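For completeness, a hedged sketch of what such error handling might look like. In standard C++ a failed new [] or a failed std::string allocation throws std::bad_alloc, and std::string can also throw std::length_error if the result would exceed its max_size(). The wrapper try_append and its bool result are hypothetical; a real program might prefer to let the exception propagate to a single handler.

```cpp
#include <new>        // std::bad_alloc
#include <stdexcept>  // std::length_error
#include <string>

// Hypothetical wrapper: returns false instead of propagating an exception
// when the append cannot be completed.
bool try_append( std::string & sentence_string, char const * line_string )
{
    try
    {
        sentence_string += line_string;
        return true;
    }
    catch ( std::bad_alloc const & )
    {
        return false;  // no memory available for the larger string
    }
    catch ( std::length_error const & )
    {
        return false;  // result would exceed sentence_string.max_size()
    }
}
```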