Not able to remove duplicates using uniq command

Question
Dear Sir,

Removing duplicates using sort and then the uniq command is not working on my file, which contains blacklisted URLs. For example:

10.44.56.78
10.44.56.78
10.wrs.org
10.wrs.org
baby.us
baby.us
zym.com
zym.com
.
.
.
So...on

The command
uniq input.txt > output.txt results in:

10.44.56.78
10.44.56.78
10.wrs.org
10.wrs.org
baby.us
zym.com
.
.
.
So...on

whereas I want this output:

10.44.56.78
10.wrs.org
baby.us
zym.com
.
.
.
So...on

Can you please suggest how to remove these duplicates (IP addresses or integer values)?

Thanks in advance.

Answer
Hi!

There must be some "whitespace" causing uniq to consider them to be different...
I would guess that some of the entries came from a Windows type environment and some from *nix...
So, some lines have ^M (a carriage return) at the end and others don't...
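
To see how an invisible character can produce this symptom, here is a minimal reproduction. The file name sample.txt and its contents are made up for illustration: one line ends in a Windows-style carriage return plus newline, the other in a plain newline.

```shell
# Build a two-line sample (hypothetical file name): same text, different line endings.
printf 'baby.us\r\nbaby.us\n' > sample.txt

# The lines differ by one invisible byte, so uniq keeps both of them.
kept=$(sort sample.txt | uniq | wc -l)
echo "lines kept by sort | uniq: $kept"

# od -c prints every byte, so the stray \r becomes visible.
od -c sample.txt
```

On a terminal both lines look the same, which is why the duplicates seem to survive for no reason.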

Try this to remove carriage return, tab and space characters first:

tr -d '\r\t ' < input.txt | sort | uniq > output.txt
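
The cleanup can be tried on a small sample before running it on the real file. This is only a sketch: sample.txt, cleaned.txt and their contents are invented for the example. The character set is written without square brackets because tr treats [ and ] as ordinary characters and would delete them too.

```shell
# Hypothetical sample: duplicates that differ only by a carriage return or a trailing space.
printf '10.wrs.org\r\n10.wrs.org\nbaby.us \nbaby.us\n' > sample.txt

# Delete carriage returns, tabs and spaces, then sort and drop duplicate lines.
# sort -u does the same job as sort | uniq in a single step.
tr -d '\r\t ' < sample.txt | sort -u > cleaned.txt

# Prints 10.wrs.org and baby.us, one per line.
cat cleaned.txt
```

Note that uniq on its own only collapses adjacent identical lines, so the sort step matters whenever the input is not already sorted.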

If that doesn't work, you need to look at the content of input.txt using something like "hd" or "od":
hd input.txt | less
And identify the character(s) causing the problem.
Once you know which character, you can add it to the list of characters to delete in the "tr" command.
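
As a sketch of that inspection step, od -c prints each byte as a character or a backslash escape, which makes hidden characters easy to spot. The file names and the stray form feed below are invented for illustration; your file may contain something else entirely.

```shell
# Hypothetical sample: the second line hides a form feed (\f) before the newline.
printf 'zym.com\nzym.com\f\n' > sample.txt

# od -c shows every byte; look for escapes such as \r, \t or \f near the line ends.
od -c sample.txt

# Once the culprit is known (here \f), add it to the set that tr deletes.
tr -d '\r\t \f' < sample.txt | sort | uniq > cleaned.txt

# Prints a single line: zym.com
cat cleaned.txt
```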

Good Luck!
Larry

mkitwrk

Expertise

Expert: Creating and managing *nix database/application servers for use with dl4/unibasic/mysql/apache/thoroughbred applications, especially in medical environments. Strengths: scripting, backup and disaster recovery, mysql, apache2, routing, samba/smbfs/cifs, LPRng, CUPS, telnet/ssh/sftp, vsftp, rsync, new system preparation, system duplication, database design, system conversions (AIX/SCO-OS5/Linux) Currently working on scripted setup of LAMP servers using PDO for MySQL and Oracle. Compiling Apache2, openssl, php and libxml2 from source and linking to libraries for MySQL and Oracle InstantClient. Works great so far! Familiar With: php, c, awk, sed, gnome, nfs and lots of other *nix tools

Experience

I've been head of development at our company since 1984. Our OS's at that time were Point 4's IRIS and Altos' Xenix. Then: SCO Xenix, SCO Unix, AIX, SCO-OS5, Caldera, RedHat 7, Debian Sarge, RedHat ES4, Debian Etch, Redhat ES5, Debian Lenny, RedHat ES6, Debian Squeeze. I've migrated our clients through those various versions with minimal interruption while preserving their investments in hardware and staff knowledge over time.

Education/Credentials
1980 BSBA Washington University, Saint Louis, Missouri

©2016 About.com. All rights reserved.