It was around 2 AM and I was working like a caveman, but it's hard to escape bedtime 😦
Suddenly I found I had set a wrong cron job in the cloud, and it had generated duplicate results. I had to make a report from the cron output, and every line in it had to be unique. The file was around 1.2 GB.
It was a JSON file with several thousand lines, many of them redundant. I had to remove the redundant lines and produce a file in which every line was unique.
I started writing a Python script to do that, and I was halfway through it: a script that takes a file and creates another file containing only the unique lines from the input. As I was too tired, I thought I should do a search to see whether there is any Unix command for this job. And I found exactly what I needed 🙂
sort filename.txt | uniq
or, equivalently, letting sort drop the duplicates itself:
sort -u filename.txt
If the input file contains:
Line 1
Line 2
Line 2
Line 3
Line 1
Line 3
The command generates:
Line 1
Line 2
Line 3
And I just redirected the output of the command into a new file, like below:
sort filename.txt | uniq > result.txt
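By the way, since the file was around 1.2 GB, a general tip (this is standard GNU sort behaviour, not anything specific to my file): sort does an external merge sort with temporary files, so it copes with files bigger than RAM, and forcing the C locale usually speeds up the comparisons a lot:
LC_ALL=C sort -u filename.txt > result.txt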
Explanation of the command:
The ‘sort’ command orders all the lines (alphabetically by default), and the ‘uniq’ command can remove or count duplicate lines in a pre-sorted file. The pre-sorting matters because uniq only compares each line with the one immediately before it, so it can only drop duplicates that sit next to each other.
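A quick way to see why the sort step is needed (the three sample lines here are just made up for illustration):
printf 'Line 1\nLine 2\nLine 1\n' | uniq          # prints all three lines: the two duplicates are not adjacent
printf 'Line 1\nLine 2\nLine 1\n' | sort | uniq   # prints only Line 1 and Line 2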
You can also use sort and uniq in other situations; for details, check the following links:
These two utility commands will help me get to sleep earlier 🙂