Column-wise comparison using awk

29.3.13
I once had a requirement to compare two files column by column rather than row by row.
I wanted to report only the rows in which at least one column differs between the two files.

For example:
File1
1 A B C D
2 E F G H
File2
1 A Z C D
2 E F Y H
3 M N O P
Below is the output I need (row 3 exists only in file2, so it is not reported):
file1 1 col2 B
file2 1 col2 Z
file1 2 col3 G
file2 2 col3 Y

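Below is the solution in awk that I wrote.
awk 'FNR==NR{a[FNR]=$0;next}   # first file: remember each line by its line number
{
  if(a[FNR])                    # compare only lines that also exist in the first file
  {
    split(a[FNR],b);
    for(i=1;i<=NF;i++)
    {
      if($i!=b[i])
      {
        # i-1 because field 1 is the row id, not a data column
        printf "file1 "b[1]" col"(i-1)" "b[i]"\n";
        printf "file2 "$1" col"(i-1)" "$i"\n";
      }
    }
  }
}' file1 file2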
Below is the test I ran on my Solaris server:


> nawk 'FNR==NR{a[FNR]=$0;next}{if(a[FNR]){split(a[FNR],b);for(i=1;i<=NF;i++){if($i!=b[i]){printf "file1 "b[1]" col"i-1" "b[i]"\n";printf "file2 "$1" col"i-1" "$i"\n";}}}}' file1 file2
file1 1 col2 B
file2 1 col2 Z
file1 2 col3 G
file2 2 col3 Y
>
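The one-liner above pairs lines by line number (FNR), so it assumes both files list their rows in the same order. Below is a minimal sketch of a variant that pairs rows by the id in the first column instead, assuming that id is unique within each file. The temporary files and the names f1, f2 and diffs are only for illustration and are not part of the original solution.

```shell
# Hypothetical sample data: the rows from the example, with file2 shuffled.
f1=$(mktemp); f2=$(mktemp)
printf '%s\n' '1 A B C D' '2 E F G H' > "$f1"
printf '%s\n' '2 E F Y H' '3 M N O P' '1 A Z C D' > "$f2"

diffs=$(awk '
  FNR == NR { a[$1] = $0; next }     # first file: index each line by its row id
  $1 in a {                          # second file: only ids also seen in the first file
    split(a[$1], b)
    for (i = 2; i <= NF; i++)        # field 1 is the id, so data columns start at 2
      if ($i != b[i]) {
        print "file1", $1, "col" (i - 1), b[i]
        print "file2", $1, "col" (i - 1), $i
      }
  }
' "$f1" "$f2")
echo "$diffs"
rm -f "$f1" "$f2"
```

The differences are reported in the order in which the rows appear in the second file.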


Counting duplicates in a file

26.3.13
I have a file where column one has a list of family identifiers:
AB
AB
AB
AB
SAR
SAR
EAR
Is there a way to create a new column in which each repeat is numbered, so that every occurrence gets its own label? For example:
AB_1
AB_2
AB_3
AB_4
SAR_1
SAR_2
EAR_1
Below is a pretty simple solution for this:
awk '{print $1"_"++a[$1]}' file
The associative array a holds a counter for every identifier, so you can also print it in the END block if you only want the counts.
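As a minimal sketch of that END-block idea, the snippet below prints only the totals per identifier. The temporary file and the names ids and counts are assumptions made for the example. The totals are piped through sort because awk visits array keys in an unspecified order.

```shell
# Hypothetical sample data matching the identifiers in the post.
ids=$(mktemp)
printf '%s\n' AB AB AB AB SAR SAR EAR > "$ids"

# Count each identifier while reading, then print every counter in the END block.
counts=$(awk '{ a[$1]++ } END { for (k in a) print k, a[k] }' "$ids" | sort)
echo "$counts"
rm -f "$ids"
```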

Converting single column to multiple columns

26.3.13
I have a file which contains all the entries in a single column like:
0
SYSCATSPACE
16384
13432
2948
1
1
TEMPSPACE1
1
1
applicable
1
2
USERSPACE1
4096
1888
2176
1
I want to convert this into a table of 3 rows and 6 columns:
0 SYSCATSPACE 16384 13432 2948       1
1 TEMPSPACE1  1     1     applicable 1
2 USERSPACE1  4096  1888  2176       1
Below is the command that I will use:
perl -lne '$a.="$_ ";
           if($.%6==0){push(@x,$a);$a=""}
           END{for(@x){print $_}}' your_file
The output is shown below (the columns are separated by a single space, not aligned):
> perl -lne '$a.="$_ ";if($.%6==0){push(@x,$a);$a=""}END{for(@x){print $_}}' temp
0 SYSCATSPACE 16384 13432 2948 1 
1 TEMPSPACE1 1 1 applicable 1 
2 USERSPACE1 4096 1888 2176 1
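The Perl command above collects all rows in an array and prints them at the end, and it silently drops a trailing row that has fewer than 6 values. Below is a minimal sketch of an awk alternative (not the command used in this post) that prints each row as soon as it is complete and also flushes an incomplete last row. The names n, cnt, row, cols and table are assumptions made for the example.

```shell
# Hypothetical sample: 7 values, so the second row is incomplete.
cols=$(mktemp)
printf '%s\n' 0 SYSCATSPACE 16384 13432 2948 1 1 > "$cols"

# n is the number of columns per row.
table=$(awk -v n=6 '
  { row = (cnt++ ? row " " : "") $0 }   # append the value; no leading space on the first one
  cnt == n { print row; cnt = 0 }       # row is full: print it and start a new one
  END { if (cnt) print row }            # flush a trailing partial row
' "$cols")
echo "$table"
rm -f "$cols"
```

Because nothing is stored beyond the current row, this also works on inputs that are too large to hold in memory.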

Capture all the letters which are repeated more than once

12.3.13
Recently I needed to fetch all the letters in a line that are repeated contiguously, that is, letters that appear two or more times in a row.

For example:

Take the text "foo bar": I want the letter 'o'.
Take the text "foo baaar": I want the letters 'o' and 'a'.
Take the text "foo baaar foo": again I want the letters 'o' and 'a'.

Below is the code that worked for me:

perl -lne 'push @a,/(\w)\1+/g;END{print @a}' your_file

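The above command scans the complete file and prints the characters that are repeated contiguously, once per run and without a separator. Note that \w also matches digits and underscores, and that a letter repeated in several places is printed once for each run, so "foo baaar foo" gives oao.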