Unix/Linux OS/Not able to remove duplicates using uniq command


Dear Sir,

Removing duplicates with sort followed by uniq is not working on my file, which contains blacklisted URLs. For example:

The command
uniq input.txt > output.txt results in:

whereas I want this output:

Can you please suggest how to remove these duplicates (IP addresses or integer values)?

Thanks in advance.


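There must be some "whitespace" causing uniq to treat those lines as different.
My guess is that some of the entries came from a Windows-type environment and some from *nix,
so some lines end in a carriage return (^M).
Keep in mind too that uniq only collapses identical lines that are adjacent, so the file has to be sorted before uniq sees it.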

Try this to remove carriage return, tab and space characters first:

tr -d '\r\t ' < input.txt | sort | uniq > output.txt
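
To see the effect on a small scale, here is a minimal sketch. The sample addresses and the temporary directory are made up for illustration; they are not taken from the original file. It builds four lines that look like two duplicated entries, then counts the distinct lines before and after stripping the invisible characters.

```shell
# Hypothetical sample: the same address twice, once with a Windows CRLF
# ending, and a second address once with a trailing space.
tmpdir=$(mktemp -d)
printf '10.0.0.1\r\n10.0.0.1\n10.0.0.2 \n10.0.0.2\n' > "$tmpdir/input.txt"

# sort | uniq alone still sees four different lines, because the
# carriage return and the trailing space are part of the line.
before=$(sort "$tmpdir/input.txt" | uniq | wc -l | tr -d ' ')

# Deleting CR, tab and space first leaves only the real entries.
tr -d '\r\t ' < "$tmpdir/input.txt" | sort | uniq > "$tmpdir/output.txt"
after=$(wc -l < "$tmpdir/output.txt" | tr -d ' ')

echo "distinct lines before: $before, after: $after"
rm -rf "$tmpdir"
```

Note that deleting every space is only safe when the entries themselves never contain one, which should hold for a list of URLs or IP addresses.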

If that doesn't work, look at the raw content of input.txt using something like "hd" or "od":
hd input.txt | less
and identify the character(s) causing the problem.
Once you know which character it is, add it to the list of characters to delete in the "tr" command.
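
As a rough illustration of what to look for, the sketch below dumps a one-line sample with "od -c", which prints control characters as escapes such as \r and \t. The sample file and its content are assumptions made for the example. It also counts how many lines contain a carriage return, which gives a quick idea of how much of the file is affected.

```shell
# Hypothetical sample: a single entry with a Windows CRLF line ending.
sample=$(mktemp)
printf '10.0.0.1\r\n' > "$sample"

# od -c shows each byte; the carriage return appears as \r just
# before the \n that ends the line.
dump=$(od -c "$sample")
echo "$dump"

# Count the lines that contain a carriage return. The CR is stored in
# a variable so that the pattern works in any POSIX shell.
cr=$(printf '\r')
crlf_lines=$(grep -c "$cr" "$sample")
echo "lines containing a carriage return: $crlf_lines"
rm -f "$sample"
```

On the real file the same two commands would be run against input.txt, piping the od output through less as with hd.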

Good Luck!
