Unix sort & uniq

It was around 2 AM and I was working like a caveman, but it's hard to escape bedtime 😦

Suddenly I found that I had set up a wrong cron job on a cloud server and it had generated duplicate results. I had to make a report from the cron output, and every line should be unique. The file is around 1.2 GB.

It was a JSON file with several thousand lines, many of them redundant. I had to remove the redundant values and produce a file in which every line is unique.

I started to write a Python script to do that. I was halfway through finishing a script that takes a file and creates another file containing only the unique lines from the input. As I was too tired, I thought I should search whether there is any Unix command for this job, and I found exactly what I needed 🙂

sort filename.txt | uniq

Or

cat filename.txt | sort -u

If the input file contains:

Line 1
Line 2
Line 2
Line 3
Line 1
Line 3

The command generates:

Line 1
Line 2
Line 3

And I just redirected the output of the command into a new file, as shown below:

sort filename.txt | uniq > result.txt
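
If memory becomes an issue with a file that size (mine was around 1.2 GB), GNU sort also accepts a buffer size and a temporary directory for its intermediate files; the values below are only examples:

sort -u -S 512M -T /tmp filename.txt > result.txt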

Explanation of the command:

The ‘sort’ command sorts all the lines (alphabetically by default), and the ‘uniq’ command can eliminate or count duplicate lines in a pre-sorted file.
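
For example, ‘uniq -c’ prefixes each line with the number of times it appeared; piping that into a reverse numeric sort lists the most duplicated lines first:

sort filename.txt | uniq -c | sort -rn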

You can also use sort and uniq in different situations; for details, check the following links:

Sort and Uniq

These two utility commands will help me sleep early 🙂


