C++/STRING MANIPULATION
Expert: Ralph McArdell - 9/17/2006
Question: If you are reading strings of varying lengths from lines of a program with a # delineator, how do you erase what is behind the delineator to form one string and what is in front of the delineator to form a second string?
For example the program contains the lines:
Anderson#John
Gross#William
Vasquez#Jesse
and you want the output:
Anderson, John
Gross, William
Vasquez, Jesse
Thank you in advance for your time.
Answer: First, I think you are confusing 'program' with 'data read by the program'. The example you show looks like no programming language I know. It does, however, look like data that a program might work on.
At the most fundamental level you read along the string one character at a time until you find a delimiting character. Everything you have read so far is the first string, everything that follows from the character after the delimiter to the end of the string will be the second string.
What I do not see is why you would want to erase anything. In C for example you could just replace the delimiter with a zero character ('\0') as C style strings are just arrays of characters which are zero terminated to indicate the end of a string:
/* Fragment: needs #include <stdio.h> and #include <string.h> */
/* One of the lines of data from your example */
const char Line1[] = "Anderson#John";
char name[ 100 ];
/* Point secondName at first character of whole name string */
char * secondName = name;
/* Make a mutable copy of the line of data */
strcpy(name, Line1);
/* Point secondName at each successive character until either
the end of the string is reached or a delimiter character
is encountered
*/
while ( *secondName != '\0' && *secondName !='#' )
{
++secondName;
}
/* If secondName points at a delimiter, replace it by a zero
character to terminate the first name part of the string and
move it forward to point to the first character of the second
name part; name and secondName are now two C-style strings.
*/
if ( '#'==*secondName )
{
*secondName = '\0';
++secondName;
}
/* Display results in example format */
printf("%s, %s\n", name, secondName );
The above demonstrates the basics of manipulating strings: moving along the string character by character, looking at what we have and reacting accordingly.
Of course we could use higher-level operations that wrap up this sort of tedious detail for us. For example, the C library contains a function called strtok that splits individual tokens out of a string, with the tokens being separated by delimiters:
/* Fragment: needs #include <stdio.h> and #include <string.h> */
/* One of the lines of data from your example */
const char Line1[] = "Anderson#John";
char name[ 100 ];
char * part1;
char * part2;
/* Make a mutable copy of this line of data */
strcpy(name, Line1);
/* Break name tokens out of name using strtok */
part1 = strtok( name, "#" );
part2 = strtok( NULL, "#" );
/* Display results in example format */
printf("%s, %s\n", part1, part2 );
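One caveat with the fragment above: strtok returns a null pointer when there is no further token, and handing a null pointer to printf for a %s conversion is undefined behaviour. The following is only a sketch of a checked version, written against the <cstring> header so that it compiles as C++. The helper name split_in_place is my own invention for illustration and is not a library function.

```cpp
#include <cstring>

// Hypothetical helper (name and signature are illustrative only).
// Splits the mutable, zero terminated string in 'buffer' in place using
// strtok with '#' as the delimiter. Returns true only if two tokens were
// found; otherwise at least one of part1 and part2 is a null pointer.
bool split_in_place( char * buffer, char *& part1, char *& part2 )
{
    part1 = std::strtok( buffer, "#" );
    part2 = ( part1 != NULL ) ? std::strtok( NULL, "#" ) : NULL;
    return part1 != NULL && part2 != NULL;
}
```

Note that strtok skips leading delimiters and treats a run of delimiters as one, so a line such as "#John" yields only a single token and this sketch reports it as a failure.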
Now the last time I looked I was an expert in the C++ section, not the C section, so how might we go about splitting strings in C++? The C++ standard library has a string class in the std namespace: std::string. This has all sorts of useful string operations which we can make use of:
// Fragment: needs #include <iostream> and #include <string>
// One of the lines of data from your example
const char Line1[] = "Anderson#John";
// Make a std::string copy of the const char[] line of data
std::string name( Line1 );
// Get index position of delimiter
std::string::size_type delimiter_position( name.find('#') );
// If we found a delimiter character, proceed
if ( std::string::npos != delimiter_position )
{
// Found # character...
// Create two new strings for each part of the name
// using the substr operation specifying the 0th start and
// delimiter_position number of characters for the first part
// and one past the delimiter to the end of the string for the
// second part
std::string part1( name.substr(0, delimiter_position) );
std::string part2( name.substr(delimiter_position+1) );
// Display results in example format
std::cout << part1 << ", " << part2 << '\n';
}
Here we have to do a little more work than using strtok but not much. We first locate the position of the # delimiter, then we split the string into two substrings using the substr operation: the first being up to but not including the # character and the second being from the character after the delimiter to the end of the string.
An alternative would be to replace the "#" with ", " in the name string, leaving it as one string but re-formatted directly to the example format:
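The find and substr steps could also be wrapped in a small function so that the caller is told when no delimiter was present. This is only a sketch: the name split_at_delimiter and its parameters are hypothetical, chosen for illustration, and are not part of the standard library.

```cpp
#include <string>

// Hypothetical helper (name and signature are illustrative only).
// Splits 'line' at the first occurrence of 'delimiter'.
// Returns true and fills 'first' and 'second' if the delimiter was found;
// returns false and leaves both output strings untouched otherwise.
bool split_at_delimiter( std::string const & line, char delimiter,
                         std::string & first, std::string & second )
{
    std::string::size_type const pos = line.find( delimiter );
    if ( std::string::npos == pos )
    {
        return false;
    }
    // Build both parts before assigning, so the function still behaves
    // if an output string happens to be the same object as 'line'.
    std::string const head( line.substr( 0, pos ) );
    std::string const tail( line.substr( pos + 1 ) );
    first = head;
    second = tail;
    return true;
}
```

A caller might test the return value and print first << ", " << second only when it is true, deciding for itself what to do with lines that have no delimiter.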
// Fragment: needs #include <iostream> and #include <string>
// One of the lines of data from your example
const char Line1[] = "Anderson#John";
// Make a std::string copy of the const char[] line of data
std::string name( Line1 );
// Get index position of delimiter
std::string::size_type delimiter_position( name.find('#') );
// If we found a delimiter character, proceed
if ( std::string::npos != delimiter_position )
{
// Found # character...
// Replace the # delimiter with comma and space to form
// a single string in the example format:
name.replace( delimiter_position, 1, ", " );
// Display results in example format
std::cout << name << '\n';
}
The code fragments I have shown are only examples. They will work but need to be extended to be really useful. For example, I have not added much error checking, such as making sure the lines of data do not exceed 99 characters in the C examples, and handling the case when a delimiter is _not_ found in the C++ examples. I have also used hard coded magic values such as 100 and '#', which is generally a bad idea. Finally, I have only used a single one of your example lines of data, pre-defined as a constant, to keep the examples simple.
On an ending note, I have not really explained much about the details of the likes of strtok and std::string. This is because these are standard library facilities that are documented all over the place, and there seems little to be gained by me just repeating such material. Try typing strtok or std::string into Google or another search engine, for example.