i can't understand the code:

std::sort(theList.begin(), theList.end(),  compare);

the compare function get just two strings(that are our keys)then if the first is greater, the compare is true in this function:

int compare(const string& s1, const string& s2)
   string a1 = s1.substr(0, s1.find(' ') );
   string a2 = s2.substr(0, s2.find(' ') );
   return a1 > a2;

the sort function just sorts these two strings?
i don't know after sorting the first two strings what does the functiosort n do?
does it then get the next two stings and sort them?
but in my example i don't understand how it works,plz just check it out:

7 , 5 , 3 , 2

( 5 , 7) , 3 , 2 <---------- the first two strings

5 , 7 , (2 , 3) <----------the next two strings

but as we see it the numbers didn't sort.
so how does the sort function work?

and could you  plz tell me what exactly does compare do in the code below?

int x = std::binary_search(theList.begin(), theList.end(), value, compare);


The std::sort algorithm works with random access iterators - so if theList is an actual linked list - e.g. std::list<std::string> - then you cannot use std::sort with it.Instead you have to use the specially provided std::list-types' sort member functions.

Secondly you show code that presumes, and talk about, elements in theList being std::string instances but then show a set of elements that appear not to be strings but integers. As your compare function seems to only use part of a string and a space is used in determining these substrings (wrongly - see below) and you show your set of elements like so:

   7 , 5 , 3 , 2

I cannot determine which spaces are part of the strings and which parts of the question text. Please place such strings in quotes as per C/C++ literal string style to make such ambiguities clear, e.g. do you mean:

   "7" , "5" , "3" , "2"

or maybe:

   " 7 "," 5 ", " 3 ", " 2 "

You will I hope appreciate that the two (and more) possible sets of string values from the values you showed will in this case have an effect on _which_ substrings are compared.

Now as far as I can work out your compare function is trying to compare the first 'word' of each string passed to it against the other - specifically whether the first word of the first string is greater than the first word of the second. Point: the operation functions passed to std::sort are binary predicates - that is they take two values and return a true/false Boolean value. Why then does your function return int rather than bool?

Now to the part you have wrong:

Oddly while the std::string find functions return index position values, the std::string substr functions take start position and _length_ of substring to return. They do not as you are doing here take start position and end position. Also remember that the find functions can return std::string::npos if the searched for item(s) is/are not found.

Now if no sort criteria (i.e. binary predicate function) are given to std::sort then std::less is used which returns true if item 1 is less than item 2, which in the case of the sequence of values "7" , "5" , "3" , "2", will yield the sequence "2", "3", "5", "7".

If then we reduce your compare function to:

   bool compare(const std::string& s1, const std::string& s2)
       return s1 > s2;

Then the sequence "7" , "5" , "3" , "2" would yield "7" , "5" , "3" , "2", that is the values are _already_ in the required order.

As to which values in the sequence std::sort presents to the comparison predicate function (and in what order) - well that depends on the exact sort algorithm(s) employed (the C++ standard guarantees that on average n-log-n comparisons performance and that std::sort not necessarily stable, see http://en.wikipedia.org/wiki/Sorting_algorithm for some possible sort algorithms that fulfil these criteria). In short your compare function will be called with values where ever in a sort algorithm you see some sort of comparison on element values. For example if we look at the merge sort algorithm given at http://en.wikipedia.org/wiki/Merge_sort, specifically the initial pseudo code for the merge_sort function then we see the line:

   if left.last_item > right.first_item

In this case, as we work with less rather than greater as the default we would be using:

   if right.first_item < left.last_item

And in the case of actually requiring a less than comparison we would expect something equivalent to:

   if ( std::less(right.first_item, left.last_item) )

To appear in the sort algorithm code.

In the case of providing your compare function then we would end up with something akin to:

   if ( compare(right.first_item, left.last_item) )

In fact it would probably look more like:

   if ( fn(right.first_item, left.last_item) )

Where fn is the function parameter passed to the std::sort algorithm.

You might like to look at the Wikipedia pages for some of the other search algorithms for more examples.

Looking at a real example in the Microsoft Visual C++ 2008 standard library I see that this implementation seems to use a combination of quick sort, heap sort and insertion sort depending on data set size, and the first use of the predicate comparison function is during the quick sort divide and conquer logic in an internal function for sorting the median of three values - fist, mid, last - where, in the case of compare, it would be called with the values:

   compare( mid, first );
   compare( last, mid );
   compare( mid, first);

If any results are true then those values are swapped. Are you any the wiser? No, neither am I - other than to appreciate that this sort of thing occurs in well optimised library code, that it is not going to be that easy to reverse engineer and that if I'd revise the quick sort partitioning logic it would probably make sense. If you really wish to then I suggest you read the contents of your std::sort function and follow around the logic - doing so would probably be instructive for you. You might like to try to obtain more than one implementation (e.g. for different C++ compilers, or maybe third party implementations such as STLport at http://www.stlport.org/).

As to your use of std::binary_search - first, again, std::binary_search returns a bool value so why use an int?

The binary search algorithm works on sorted data sets. For best results the algorithm requires random access to the data sets - but the std::binary_search implementation will work with just a forward iterator (one that can only move forward one element at a time).

Once again std::less is used if no operation function is provided. Once again the operation is a binary predicate that takes two values and returns a bool.

In this case one of the values will be the value you pass to std::binary_search. The other will be one of the elements in the sequence. A binary search starts in (approximately) the central element. It then compares the value of this element to the value being searched for. If it is not equal to the value searched for (a not before b and b not before a) then the next element checked is either at around position 1/4 or 3/4 of the number of elements, depending on whether the value searched for is before (comparison predicate returns true) or after  (comparison predicate returns false, and switching the arguments returns true). Again the result from the tests is used to determine whether the value has been found or weather to search nearer the start of the sequence or nearer the end. At each step the change in position is halved  thus the steps are (starting at the begriming) : +(number of elements)/2, +/- (number of elements)/4, +/- (number of elements)/8... until the step size is less than 0.5 which leads to some complexity when working with integer index and offset values; the +/- means plus or minus - i.e, add or subtract the amount to the current position. Here is a simple example using std::less (as your compare function seem broken I see little point using it for explanatory purposes):

Given a sequence of sorted integers using less than as the is before criteria:

 1 3 5 7 9 11 13 15 17 19 21

In which we have 11 values.

We are looking to see if 19 is in the (sorted) sequence. To take care of the 0.5 requirement we will use a roundabout way of adding the 0.5 to integers. We want to do this:

  truncate(n / d + 0.5)
Instead we shall do this using integer operations:

 (2*n/d + 1)/2

That is we multiply the whole expression inside the truncate by 2 then divide result down by 2 again.

We have 11 elements so 2*n is 22.

The first element we shall look at is (22/2 + 1)/2 = 6, counting along we see this has the value 11:

   19 < 11 is false so 19 is not before 11
   11 < 19 is true so 11 is before 19 (or 19 is after 11)

Hence we need to increase our position. We add (22/4+1)/2 = 3 to our position, which moves us to the element at position 9 from the start, which is 17

   19 < 17 is false so 19 is not before 17
   17 < 19 is true so 17 is before 19 (or 19 is after 17)

Hence we need to increase out position. We add (22/8+1)/2 = 1 to our position which moves us to the element at position 10 from the start, which is 19

   19 (searched for value) < 19 (element value) is false so 19 is not before 19
   19 (element value) < 19 (searched for value) is false so 19 is not before 19

Hence we have found the element we are looking for, return true.

Now let's search for 23:

The first element we shall look at again is (22/2 + 1)/2 = 6, the element with the value 11:

   23 < 11 is false so 23 is not before 11
   11 < 23 is true so 11 is before 23 (or 23 is after 11)

Hence we need to increase out position. Adding (22/4+1)/2 = 3 to our position, which moves us again to the element at position 9 from the start, which is 17

   23 < 17 is false so 23 is not before 17
   17 < 23 is true so 17 is before 23 (or 23 is after 17)

Hence we need to increase out position. Adding (22/8+1)/2 = 1 to our position moves us again to the element at position 10 from the start, which is 19

   23 < 19 is false so 23 is not before 19
   19 < 23 is true so 19 is before 23 (or 23 is after 19)

Hence we need to increase out position. Adding (22/16+1)/2 = 1 to our position moves us to the element at position 11 from the start, which is 21

   23 < 21 is false so 23 is not before 21
   21 < 23 is true so 21 is before 23 (or 23 is after 21)

Hence we need to increase out position. We would want to add (22/32+1)/2 = 0 to our position, but the change is zero so we reached the end of the search without locating the value we were looking for and false is returned.

Notionally, where you see a less than comparison std::less is called [as a function is required, the operator < cannot be used directly, so it is wrapped, as with other operators, in a function (template) std::less]. In fact an implementation may provide a totally separate function (and related internal functions) for the default case and use the < operator directly.

To see how it would work with your compare function just replace the sequence of values with those from your sorted sequence, then replace the parts of above of the form "v1 < v2 is" with "compare(v1, v2) returns".

However I think you need to fix the bugs in compare first.

Hope this has given you some insight into the sorts of things going on.  


All Answers

Answers by Expert:

Ask Experts


Ralph McArdell


I am a software developer with more than 15 years C++ experience and over 25 years experience developing a wide variety of applications for Windows NT/2000/XP, UNIX, Linux and other platforms. I can help with basic to advanced C++, C (although I do not write just-C much if at all these days so maybe ask in the C section about purely C matters), software development and many platform specific and system development problems.


My career started in the mid 1980s working as a batch process operator for the now defunct Inner London Education Authority, working on Prime mini computers. I then moved into the role of Programmer / Analyst, also on the Primes, then into technical support and finally into the micro computing section, using a variety of 16 and 8 bit machines. Following the demise of the ILEA I worked for a small company, now gone, called Hodos. I worked on a part task train simulator using C and the Intel DVI (Digital Video Interactive) - the hardware based predecessor to Indeo. Other projects included a CGI based train simulator (different goals to the first), and various other projects in C and Visual Basic (er, version 1 that is). When Hodos went into receivership I went freelance and finally managed to start working in C++. I initially had contracts working on train simulators (surprise) and multimedia - I worked on many of the Dorling Kindersley CD-ROM titles and wrote the screensaver games for the Wallace and Gromit Cracking Animator CD. My more recent contracts have been more traditionally IT based, working predominately in C++ on MS Windows NT, 2000. XP, Linux and UN*X. These projects have had wide ranging additional skill sets including system analysis and design, databases and SQL in various guises, C#, client server and remoting, cross porting applications between platforms and various client development processes. I have an interest in the development of the C++ core language and libraries and try to keep up with at least some of the papers on the ISO C++ Standard Committee site at http://www.open-std.org/jtc1/sc22/wg21/.


©2017 About.com. All rights reserved.