C++/Thousand Separator
Expert: Ralph McArdell - 8/23/2008
Question
Hi,
I've searched the internet and found some information on how to properly format a string of numbers into output that has thousands (comma) separators.
But I couldn't find anything that helps me do an error check that recognizes a mistake in input such as: 10,00
I've tried using
1) atof(argv[i]) to convert the parameter input into double
2) and then use the str.format to format the double value with thousand separator
3) ONLY THEN I compare the original input with the formatted number. If different, then it'll show an error message, if it's okay, then it should continue to the next error check.
Is there an easier way to correctly identify the correct use of comma in 10,00 or 1,000 (which is correct)?
Thanks in advance
Answer
I note that you have tried converting the value using a standard C library function, which of course the C++ standard library also includes. Hence I am unsure whether you are asking this for use in a C program or a C++ program.
However I am a C++ expert, and you have asked this question of me in the C++ section of AllExperts and not the C section so I shall assume you are using C++.
The short answer, you will be pleased to know, is 'yes'.
You use C++ IOStreams and locales.
----------------------------------------------------------------------------------------------------------------
Note: I am going to assume you are using a reasonably ISO standard C++
language and library implementation.
This is reasonable; the ISO C++ standard was published about 10 years
ago so most C++ implementations should be fairly compliant by now.
----------------------------------------------------------------------------------------------------------------
By default the C++ IOStreams use the C locale for localisation settings such as numeric formatting. This can be changed for a stream quite easily by passing the new locale to the stream's imbue member function.
We can use the constructor of the std::locale class to build a locale for a specific region. Unfortunately the names used for these regions differ between C++ library implementations; however, the empty name "" is used to mean we wish a locale consistent with that of the current system settings. We can therefore write something like this:
#include <iostream>
#include <locale>
//...
std::locale system_locale("");
std::cout.imbue(system_locale);
std::cin.imbue(system_locale);
Which imbues the standard console output and input streams with a locale consistent with the currently set system locale.
We can make use of the stream state functions to test for specific conditions, so we could write something like so:
#include <iostream>
#include <locale>
int main()
{
    std::locale system_locale("");
    std::cout.imbue(system_locale);
    std::cin.imbue(system_locale);
    std::cout << "x=";
    double x(0.0);
    std::cin >> x;
    std::cout << "std::cin ";
    if ( std::cin.good() )
    {
        std::cout << "is good\n";
    }
    else
    {
        if ( std::cin.bad() )
        {
            std::cout << "is fatally bad ";
        }
        else if ( std::cin.fail() )
        {
            std::cout << "has formatting failure ";
        }
        if ( std::cin.eof() )
        {
            std::cout << " is at EOF ";
        }
        std::cout << std::endl;
    }
    return 0;
}
In which we first prompt for and read in a double value then write out text describing the state of std::cin.
Assuming we have our system settings set to a locale which uses commas for thousands separators (such as the UK or USA) then we should find that entering 1,000 gives:
std::cin is good
as the output and entering 10,00 gives
std::cin has formatting failure
Ok, you say this is all well and good but I am not reading my data from the console, nor from a file. I am using command line argument values.
Ok I reply, then use a std::istringstream which reads input from a std::string.
We could use a std::istrstream object instead, which in this case is possibly a more obvious choice as it works with char * (i.e. character arrays). However std::istrstream and std::ostrstream are listed as compatibility features in the ISO C++ standard for backwards compatibility with pre-standard C++ library implementations and code using them. They are documented as "deprecated features" and the standard goes on to say
"where deprecated is defined as: Normative for the current edition of the
Standard, but not guaranteed to be part of the Standard in future revisions."
(the term "Normative" means it has to be included in ISO standard C++ compliant C++ implementations)
To use a std::istringstream we create an object of that type passing it the string to use as input. If we pass a C-style zero terminated array of char then it will be converted to a std::string. We have to include <sstream>. Thus we could write:
#include <sstream> // for std::istringstream
// ...
std::istringstream iStrStrm( argv[1] );
We then replace all uses of std::cin in the original program with iStrStrm (also modifying the declaration of main to take command line arguments):
#include <iostream>
#include <sstream> // for std::istringstream
#include <locale>
int main( int argc, char * argv[] )
{
    if ( argc < 2 )
    {
        std::cerr << "Error: require at least one numeric command line argument."
                  << std::endl;
        return 1;
    }
    std::locale system_locale("");
    std::cout.imbue(system_locale);
    std::istringstream iStrStrm( argv[1] );
    iStrStrm.imbue( system_locale );
    double x(0.0);
    iStrStrm >> x;
    std::cout << "x=" << x
              << "\nstd::cin "
              ;
    if ( iStrStrm.good() )
    {
        std::cout << "is good\n";
    }
    else
    {
        if ( iStrStrm.bad() )
        {
            std::cout << "is fatally bad ";
        }
        else if ( iStrStrm.fail() )
        {
            std::cout << "has formatting failure ";
        }
        if ( iStrStrm.eof() )
        {
            std::cout << "is at EOF ";
        }
        std::cout << std::endl;
    }
    return 0;
}
In this version I have also printed out the value of x read in from the string stream.
Now if we try it we should see something like the following:
For:
theProgram 1,000
The output should be something like:
x=1000
std::cin is at EOF
For:
theProgram 10,00
The output should be something like:
x=1000
std::cin has formatting failure is at EOF
You will note that in this case we end up at the end of the stream. This is because there is no whitespace around command line arguments so the stream found the end of the string and interpreted that to be the end of file condition.
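The reason the first case reports "is at EOF" rather than "is good" is that good() only returns true when no state bit at all is set, and that includes the end-of-file bit. As a rough sketch of how the state bits combine, the following reports each bit separately. The names describe_state and state_after_reading_int are made up for this illustration, and it reads an int in the default locale simply to keep the example small.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the standard library): lists the state
// bits currently set on a stream, or "good" if none are set.
std::string describe_state(const std::ios& stream)
{
    const std::ios_base::iostate state = stream.rdstate();
    if (state == std::ios_base::goodbit)
    {
        return "good";
    }
    std::string text;
    if (state & std::ios_base::badbit)  text += "[bad]";
    if (state & std::ios_base::failbit) text += "[fail]";
    if (state & std::ios_base::eofbit)  text += "[eof]";
    return text;
}

// Hypothetical helper: reads one int from the text and reports the
// resulting stream state.
std::string state_after_reading_int(const std::string& text)
{
    std::istringstream in(text);
    int value = 0;
    in >> value;
    return describe_state(in);
}
```

With this, reading "42" reports only the end-of-file bit (the read succeeded but ran into the end of the string), whereas reading "42 " with a trailing space leaves the stream good, and reading "x" reports a formatting failure.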
The fact that x was still set to 1000 even though the formatting failed is specific to this library implementation (g++ 4.2.3 under 64-bit Ubuntu Linux 8.04). Using Microsoft Visual C++ 2005 x is _not_ set in the case of failure (i.e. it remains at its initial setting of 0.0). Thus you should _never_ use a value extracted from a stream if that stream is in a bad or failed state.
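One way to follow that rule is to wrap the extraction in a small function that only hands back a value when the read succeeded. What follows is just a sketch: the name try_parse_double and its signature are my own invention for illustration, and the extra check for trailing characters is an addition that you may or may not want.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: parses text as a double using the numeric rules of
// the given locale. The result is written only when the whole text was a
// valid number; on any failure the caller's variable is left untouched.
bool try_parse_double(const std::string& text,
                      const std::locale& loc,
                      double& result)
{
    std::istringstream in(text);
    in.imbue(loc);

    double value = 0.0;
    if (!(in >> value))
    {
        return false;   // stream is in a failed or bad state
    }

    char extra;
    if (in >> extra)
    {
        return false;   // something other than whitespace follows the number
    }

    result = value;     // only now do we trust the extracted value
    return true;
}
```

A caller would then write something like if ( try_parse_double( argv[1], system_locale, x ) ) and only use x inside that branch, so the result no longer depends on what a particular library does with x on failure.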
You can set the locale to a specific locale if, say, your system is set up for mainland European or other locale that does not use commas as thousand separators. For example we could set the locale to the USA. For the Microsoft C++ library we can use:
std::locale usa_locale("American_USA.1252");
And for the GNU (g++) C++ library we can use:
std::locale usa_locale("en_US.UTF-8");
I had to hunt for examples of these locale names - the Microsoft name I found in an example in a book and the GNU C++ library locale name I found in an old forum question online!
Which locales are available is up to the library implementers so you might be really unlucky and only get the classic C locale included. Let us hope this is not the case.
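If no suitable named locale turns out to be available, one possible fallback is to build a locale yourself from the classic C locale plus a custom numeric punctuation facet. This is only a sketch of that idea: the names comma_grouping, make_comma_locale and is_well_grouped are hypothetical, and the facet hard-codes commas between groups of three digits rather than taking them from any system setting.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: digits are grouped in threes, separated by commas.
class comma_grouping : public std::numpunct<char>
{
protected:
    virtual char do_thousands_sep() const { return ','; }
    virtual std::string do_grouping() const { return "\3"; }
};

// Builds a locale that is the classic C locale except for number grouping.
// The locale takes ownership of the facet object and deletes it when done.
std::locale make_comma_locale()
{
    return std::locale(std::locale::classic(), new comma_grouping);
}

// Hypothetical check: true if the text can be read as a double without a
// formatting failure under the comma-grouping rules. Trailing characters
// after the number are not examined here.
bool is_well_grouped(const std::string& text)
{
    std::istringstream in(text);
    in.imbue(make_comma_locale());
    double value = 0.0;
    in >> value;
    return !in.fail();
}
```

With such a locale the stream should accept 1,000 and reject 10,00 in the same way as described above for a UK or USA system locale, while plain 1000 with no separators is still accepted.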
Finally we can set the locale in other ways. Firstly we can do away with the named locale variable and just pass a temporary locale to imbue:
std::cin.imbue( std::locale("") ); // imbue std::cin with current system locale
Next we can set the default locale for any streams created _after_ we have made the call:
std::locale::global( system_locale );
or:
std::locale::global( std::locale("") );
If after this we create a stream we do not need to imbue it explicitly with the locale if the globally set locale is the one we want:
std::istringstream iStrStrm( argv[1] );
// Note no call to iStrStrm.imbue( system_locale ) required.
double x(0.0);
iStrStrm >> x;
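Because the global locale affects every stream created afterwards, anywhere in the program, you may prefer to limit how long the change lasts. std::locale::global returns the previously set global locale, so it can be put back later. The following is a sketch of one way to do that; the class name scoped_global_locale and the function new_stream_uses_global_locale are hypothetical names used only for this example.

```cpp
#include <locale>
#include <sstream>

// Hypothetical guard: installs a locale as the global locale and restores
// the previous global locale when the guard goes out of scope.
class scoped_global_locale
{
public:
    explicit scoped_global_locale(const std::locale& loc)
        : previous_(std::locale::global(loc))   // global() returns the old one
    {
    }
    ~scoped_global_locale()
    {
        std::locale::global(previous_);
    }
private:
    scoped_global_locale(const scoped_global_locale&);            // not copyable
    scoped_global_locale& operator=(const scoped_global_locale&); // not assignable
    std::locale previous_;
};

// Returns true when a newly created string stream has picked up the
// current global locale without any call to imbue.
bool new_stream_uses_global_locale()
{
    std::istringstream in("0");
    return in.getloc() == std::locale();   // std::locale() copies the global locale
}
```

Streams created while a guard object is alive use the locale passed to it; streams that already existed, including std::cout and std::cin, keep whatever locale they had and still need an explicit call to imbue.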
I have only touched the very basics here. IOStreams and locales and the individual localisation facets that make up a locale are a large topic which can be a bit scary at first - I certainly found it so until I read up on locales a bit more.
I should mention that in addition to the C++ locale support there is also the legacy C locale support. And of course there is probably native operating system support, especially on systems such as Microsoft's which, unlike UN*X-like systems, do not use the C locale system natively. However the method presented here seems to me to be the easiest way to do what you wish - and easy is what you were asking for <g>.
I suggest you get yourself a good reference for the C++ standard library in general such as "The C++ Standard Library: A Tutorial and Reference" by Nicolai M. Josuttis. For more detailed information specifically on IOStreams and locales there is "Standard C++ IOStreams and Locales" by Angelika Langer and Klaus Kreft.
Hope this at least gets you moving forward and provides references with which you can expand on the brief information presented here.