You are here:

# C++/How to remove duplicates in a vector

Question
I am writing a program that is suppose to remove duplicates in a vector.  I am having trouble writing the remove_duplicates function.  I am suppose to go through all the elements and check to see if the same value occurs at a lower index.  If true, remove the element.

Here is my code so far..

#include <iostream>
#include <vector>

using namespace std;

void remove_element(int pos, vector<int>&a)
{
for(int i = pos; i < a.size() - 1; i++)
{
a[i] = a[i + 1];
a.pop_back();
}
}

void remove_duplicates(vector<int>& a,vector<int>& b)
{
for(int k = 0; k < a.size()-1; k++)
{
if(a[k] == a[k+1])
}

}

void printVector(vector<int> a)
{
for (int i = 0; i < a.size(); i++)
cout << a[i] << " ";
cout << "\n";
}

int main()
{
vector<int> empty;
vector<int> v1(9);
v1[0] = 1; v1[1] = 4; v1[2] = 9; v1[3] = 16;
v1[4] = 9; v1[5] = 7; v1[6] = 4; v1[7] = 9; v1[8] = 11;
vector<int> v2(7);
v2[0] = 11; v2[1] = 11; v2[2] = 7; v2[3] = 9; v2[4] = 16;
v2[5] = 4; v2[6] = 1;

remove_duplicates(empty);
printVector(empty);
printVector(v1);
remove_duplicates(v1);
printVector(v1);
printVector(v2);
remove_duplicates(v2);
printVector(v2);

}

Well your remove_duplicates function only checks for this element equalling the next element in the sequence not all other elements in the sequence.

It is also incorrect when k==a.size()-1, then element k+1 is accessed but there is _no_ element k+1, as elements are indexed from 0 to a.size()-1.

Your approach will require an inner loop, from k+1 to != a.size():

Your outer loop only needs to have k to go up to k != a.size() - 2 as the last element has no elements further along to check:

for( unsigned int k = 0; k != a.size()-2; ++k)
{
for ( unsigned int l = k; l != a.size()-1; ++l )
{
if ( a[l] == a[k] )
{
// remove the element...
}
}
}

Now the std::vector class template has an erase member that can erase a single element or a range of elements.

The C++ standard library also has an algorithm function template that can remove all instances of a specific value from a sequence of values. However it does not really remove the values - it just copies the values later along the sequence into the 'holes' left by the removed values and returns the position of the new supposed end. We can use this together with the erase member function of vector to 'remove' all the various duplicate elements from the vector then properly erase them by passing the new supposed end of the vector to erase as the first element to erase and the end of the vector as one-past-the-end to stop at (all sequence ranges are specified as first to one-past-last in C++ standard library ranges, they are also specified using things called iterators which can be thought of as positions or generalised pointers to the underlying elements).

Thus to remove and erase a single known value from all elements of a vector we would write something like so:

a.erase( std::remove(a.begin(), a.end(), the_value_to_remove), a.end() );

To see what is going on suppose the vector a contained the values:

3 5 2 7 6 4 5 9 1 5 0 :end:

where :end: represents a one-past-the-end position placeholder returned by a.end().

If the_value_to_remove == 5 then after executing:

std::remove(a.begin(), a.end(), the_value_to_remove)

all the elements having the value 5 will be removed and the other elements will be copied around to fill the holes viz:

3 2 7 6 4 9 1 0 :1: 5 0 :end:

The returned position will be where the new one-past-the-end end point should be - which will point to the element I showed between colons: :1:.

Passing this element's position and the current a.end() position to a.erase() will erase these now unneeded elements, thus after executing:

a.erase( std::remove(a.begin(), a.end(), the_value_to_remove), a.end() );

we have a vector with elements:

3 2 7 6 4 9 1 0 :end:

Now this is all very interesting I hear you say, if a bit over the top, but how does this help you to remove duplicates?

Well first the sequence range passed to remove does not have to start at the first element of a container ( such as. a.begin() ), nor finish at the end ( such as a.end() ).

Second we do not have to call erase after every use of remove - we can continue removing elements until we are done.

http://en.allexperts.com/q/C-1040/Delete-duplicate-elements-vector.htm

Of particular relevance might be the last part from:

"If sorting the vector is unacceptable then things are a little more complex...."

Hope this helps

Oh, and if you are using the C++ standard library containers and hopefully the algorithms that work with them, then you might find a decent book on the subject of use. For all the C++ standard library a decent general purpose reference and tutorial is "The C++ Standard Library a Tutorial and Reference" by Nicolai M. Josuttis.

C++

Volunteer

#### Ralph McArdell

##### Expertise

I am a software developer with more than 15 years C++ experience and over 25 years experience developing a wide variety of applications for Windows NT/2000/XP, UNIX, Linux and other platforms. I can help with basic to advanced C++, C (although I do not write just-C much if at all these days so maybe ask in the C section about purely C matters), software development and many platform specific and system development problems.

##### Experience

My career started in the mid 1980s working as a batch process operator for the now defunct Inner London Education Authority, working on Prime mini computers. I then moved into the role of Programmer / Analyst, also on the Primes, then into technical support and finally into the micro computing section, using a variety of 16 and 8 bit machines. Following the demise of the ILEA I worked for a small company, now gone, called Hodos. I worked on a part task train simulator using C and the Intel DVI (Digital Video Interactive) - the hardware based predecessor to Indeo. Other projects included a CGI based train simulator (different goals to the first), and various other projects in C and Visual Basic (er, version 1 that is). When Hodos went into receivership I went freelance and finally managed to start working in C++. I initially had contracts working on train simulators (surprise) and multimedia - I worked on many of the Dorling Kindersley CD-ROM titles and wrote the screensaver games for the Wallace and Gromit Cracking Animator CD. My more recent contracts have been more traditionally IT based, working predominately in C++ on MS Windows NT, 2000. XP, Linux and UN*X. These projects have had wide ranging additional skill sets including system analysis and design, databases and SQL in various guises, C#, client server and remoting, cross porting applications between platforms and various client development processes. I have an interest in the development of the C++ core language and libraries and try to keep up with at least some of the papers on the ISO C++ Standard Committee site at http://www.open-std.org/jtc1/sc22/wg21/.

Education/Credentials