C++/Count the number of occurrences of a word in string
Hi, I'm trying to learn C++ and I'm a beginner. I have written code to count the number of occurrences of a letter in a string, but I'm not sure how to count the number of occurrences of a word in a string.
For example, this is my string:
"Hello there, take care and be happy and smile"
I use this code to count the letter 't', and the output is 2.
int countVowels(char str[]) {
    int count = 0, k = 0;
    while (str[k] != '\0') {
        if (str[k] == 't') ++count;  // count each lowercase 't'
        ++k;                         // move on to the next character
    }
    return count;
}
How do I count how many 'and's are in the string? I used the same code and changed the 't' to 'and', but the output is 0.
Thank you in advance!
Well I would say for a start that you are learning C rather than C++ - or those parts of C++ inherited from C.
Secondly, if you are trying to count vowels why are you in fact counting lowercase 't'? Maybe the function should be called countLowerTees or something?
Now counting words can be tricky - I mean how do you define a word?
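Count the occurrences of "and" in:
"hand and band"
"There were blue and green and, to make it hard we used punctuation"
Should the result for the first be 1 or 3? For the second, 1 or 2? It depends on whether the "and" inside "hand" counts, and on whether "and," with its trailing comma counts as "and".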
So first decide how you wish to break your string into words, break it into words, then compare each word against your required word, for which you can use the C (and C++) standard library function strcmp. Here is a code snippet modified from your original posting:
int count = 0;
char * pWord = GetWord( str );      // first word of the string
while ( pWord != NULL ) {
    if ( 0==strcmp( pWord, "and") )
        ++count;                    // this word is exactly "and"
    pWord = GetWord( NULL );        // next word of the same string
}
strcmp returns 0 if both strings match (meaning no difference). It returns a value < 0 if the first string is lexicographically less than the second and a value > 0 if the first string is lexicographically greater than the second.
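To make the three cases concrete, here is a minimal sketch. The helper name CompareSign is made up for illustration; it reduces the result of strcmp to -1, 0 or +1, since only the sign of a non-zero result is meaningful.

```cpp
#include <cstring>   // std::strcmp

// Hypothetical helper (not part of any library): reduces the result of
// strcmp to -1, 0 or +1, because only the sign of a non-zero result matters.
int CompareSign( char const * lhs, char const * rhs )
{
    int const result( std::strcmp( lhs, rhs ) );
    if ( result < 0 ) return -1;   // lhs sorts before rhs
    if ( result > 0 ) return 1;    // lhs sorts after rhs
    return 0;                      // the strings are identical
}
```

With this, CompareSign("and", "and") is 0, CompareSign("and", "band") is -1 because 'a' sorts before 'b', and CompareSign("hand", "and") is 1. The comparison is case sensitive, so "And" and "and" do not match.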
Now there is a second function that can do the job of GetWord in the above example - it is called strtok and it splits a string into tokens (like words) broken on any of a set of characters provided as a second argument, so we could write GetWord like so:
char * GetWord( char * str )
{
    return strtok( str, " ,;:!.?" );  // split on space and common punctuation
}
Notice that str is passed as a non-constant pointer. This indicates that strtok (and therefore GetWord) will modify the passed-in string. In fact it overwrites the delimiter character that ends each word with '\0', so that each word is terminated at the right point. The original delimiters are not restored afterwards.
As the second argument, the set of word-ending (or delimiting) characters, I specified a likely selection of characters that may follow the end of a word, such as space, comma, full stop etc.
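Putting the pieces together, one possible way to count whole words with strtok is sketched below. The function name CountWordTokens is hypothetical, and copying the text into a std::vector<char> is my own choice here, made so that the caller's string is not overwritten by strtok. It uses the cstring header and the std:: prefix.

```cpp
#include <cstring>   // std::strtok, std::strcmp, NULL
#include <string>
#include <vector>

// Hypothetical helper: counts whole-word matches of 'word' in 'text'.
// std::strtok overwrites delimiters with '\0', so we tokenise a private,
// modifiable copy and leave the caller's string untouched.
int CountWordTokens( std::string const & text, char const * word )
{
    std::vector<char> buffer( text.begin(), text.end() );
    buffer.push_back( '\0' );            // strtok needs a terminated string

    char const * const delimiters = " ,;:!.?";
    int count( 0 );
    for ( char * token = std::strtok( &buffer[0], delimiters );
          token != NULL;
          token = std::strtok( NULL, delimiters ) )
    {
        if ( 0 == std::strcmp( token, word ) )
            ++count;                     // this token is exactly the word
    }
    return count;
}
```

For "Hello there, take care and be happy and smile" this gives 2, for "hand and band" it gives 1, and for the punctuation example it gives 2 because the comma is in the delimiter set. Bear in mind that strtok keeps its position in hidden static state, so two tokenising loops cannot be interleaved.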
Now these string functions are declared in a standard header file, called string.h for C or cstring (no .h) for C++. So you have to include it in your program:
#include <cstring> // C++ library include for strtok and strcmp
In standard C++ the cstring form of the header declares the names of the C library functions in the namespace std (like most of the C++ only standard library names), and in practice they are usually available in the global namespace as well. This will probably not mean much to you yet but may be helpful later!
Now the next question is why are you using C-style strings, meaning zero terminated arrays of characters? The C++ standard library includes a string class, std::string, which relieves you of many tedious details, although the facilities of strtok are not among the things it provides. Ho hum, such is life.
Here is a simple example code fragment using the C++ string class:
#include <string> // C++ library include for std::string
std::string str("Hello there, take care and be happy and smile");
std::string::size_type and_pos( 0 );
int count( 0 );
while ( and_pos!=std::string::npos ) {
    and_pos = str.find("and", and_pos );
    if ( and_pos != std::string::npos ) {
        ++count;
        and_pos += 3; // start next search after this "and"
    }
}
Here I have hard coded "and" and its length of 3 - not a good idea. However we could write a function like so:
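int CountWord( std::string const & str, std::string const & word ) // name is illustrative
{
    int count( 0 );
    // an empty word would never advance the search, so skip the loop for it
    std::string::size_type word_pos( word.empty() ? std::string::npos : 0 );
    while ( word_pos!=std::string::npos ) {
        word_pos = str.find(word, word_pos );
        if ( word_pos != std::string::npos ) {
            ++count;
            // start next search after this word
            word_pos += word.length();
        }
    }
    return count;
}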
Note that this is fairly easy to at least partially understand, with operations on the string objects such as find and length. However this simple example does lose the ability to differentiate "and" as a whole word from "and" as part of another word, as in "hand"...
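One way to get whole-word matching back while staying with std::string is to check the characters on either side of each match. The sketch below assumes that a "word" is bounded by anything that is not a letter, which is only one possible definition. The names IsLetterAt and CountWholeWord are made up for illustration.

```cpp
#include <cctype>    // std::isalpha
#include <string>

// Hypothetical helper: true if the character at 'pos' is a letter.
bool IsLetterAt( std::string const & str, std::string::size_type pos )
{
    return std::isalpha( static_cast<unsigned char>( str[pos] ) ) != 0;
}

// Hypothetical function: counts occurrences of 'word' that do not touch
// a letter on either side (assumed definition of a whole word).
int CountWholeWord( std::string const & str, std::string const & word )
{
    int count( 0 );
    if ( word.empty() )
        return count;                    // an empty word never matches

    std::string::size_type pos( str.find( word ) );
    while ( pos != std::string::npos )
    {
        std::string::size_type const end( pos + word.length() );
        bool const letter_before( pos > 0 && IsLetterAt( str, pos - 1 ) );
        bool const letter_after( end < str.length() && IsLetterAt( str, end ) );
        if ( !letter_before && !letter_after )
            ++count;                     // match stands alone as a word
        pos = str.find( word, pos + 1 ); // step by one so no match is skipped
    }
    return count;
}
```

Under that assumption CountWholeWord("hand and band", "and") is 1, whereas a plain substring search finds 3 occurrences in the same string.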
You will have noticed several things about my style of writing C++: initialising objects using initialiser syntax (i.e. int a(1); instead of int a = 1;), preferring pre-increment (++a) and pre-decrement to post-increment (a++) and post-decrement, and of course the use of fully qualified names, that is, ones using :: to specify the parts of the name in the various scopes. Such scopes are created with the namespace keyword or by defining a class, so the C++ string class is a name in the std namespace and it contains the names npos and size_type, hence std::string::npos and std::string::size_type.
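As a small illustration of those three habits in one place, here is a sketch that goes back to the original task of counting a single letter, this time on a std::string. The function name CountChar is hypothetical.

```cpp
#include <string>

// Hypothetical function: counts the characters of 'str' equal to 'ch'.
std::string::size_type CountChar( std::string const & str, char ch )
{
    std::string::size_type count( 0 );        // initialiser syntax
    for ( std::string::size_type i( 0 );      // fully qualified type name
          i != str.length();
          ++i )                               // pre-increment
    {
        if ( str[i] == ch )
            ++count;
    }
    return count;
}
```

CountChar("Hello there, take care and be happy and smile", 't') gives 2, the same answer as the C-style version in the question.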
In fact the standard C++ library contains a great many useful things: collection classes such as lists and vectors, algorithms such as sort or find that operate on those collections, and input / output streams to read and write files, consoles and other things such as strings. I suggest you get acquainted with it as well as the core language. You might try Accelerated C++ by Koenig and Moo. However, I have heard that this book is better for people who already have some programming experience, so maybe try a newer book called "You Can Do It - A Beginner's Introduction to Computer Programming" by Francis Glassborow and Roberta Allen.
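To give a flavour of how these library pieces fit together, here is a sketch that uses a string stream to split text into words, a vector to hold them and the count algorithm to tally matches. The names SplitOnWhitespace and CountExactWords are made up for illustration, and the splitting is on whitespace only, so punctuation stays attached to the words.

```cpp
#include <algorithm> // std::count
#include <sstream>   // std::istringstream
#include <string>
#include <vector>

// Hypothetical helper: splits 'text' on whitespace into a vector of words.
std::vector<std::string> SplitOnWhitespace( std::string const & text )
{
    std::istringstream input( text );
    std::vector<std::string> words;
    std::string word;
    while ( input >> word )              // operator>> skips whitespace
        words.push_back( word );
    return words;
}

// Hypothetical function: counts words that are exactly equal to 'word'.
int CountExactWords( std::string const & text, std::string const & word )
{
    std::vector<std::string> const words( SplitOnWhitespace( text ) );
    return static_cast<int>( std::count( words.begin(), words.end(), word ) );
}
```

For "Hello there, take care and be happy and smile" this gives 2. For the punctuation example it gives only 1, because "and," with its comma is not equal to "and", which shows again that the definition of a word has to be decided first.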
As to other sources of information, a good place to start is the ACCU site at http://www.accu.org - they have a book reviews section and resource links as well as mentored development areas for members.
You might like to look at the C++ links at:
However I cannot vouch for the quality of everything you find there. Some I do know of, such as "The On-Line C++ FAQ" by Marshall Cline on the parashift site, which is indeed the first place you are directed to before posting a question on the comp.lang.c++.moderated newsgroup.