File comparisons using awk: Match columns
File3
a
c
e
File4
a 1
b 2
c 3
d 4
e 5
The one-liner for comparing the first field of file4 with the first field of file3 is:
awk 'FNR==NR{a[$0];next}($1 in a)' file3 file4
and the output is:
a 1
c 3
e 5
If you instead want to remove the lines that match, negate the condition in the command above by adding a !:
awk 'FNR==NR{a[$0];next}!($1 in a)' file3 file4
Can you please explain how it works?
awk 'FNR==NR{a[$0];next}($1 in a)' file3 file4
FNR -> the line number within the current file.
NR -> the running line number across all the input files read so far.
So the first thing is:
FNR==NR -> this condition is true for as long as awk is still reading the first file. As soon as all the lines of file3 have been processed, FNR is reset to 1 for the next file while NR continues its numbering, so the two no longer match.
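To see the two counters side by side, here is a small sketch that prints them for every input line. It assumes file3 and file4 hold one record per line as in the post; the scratch directory and the trace variable are only there for illustration.

```sh
#!/bin/sh
# Recreate the two sample files from the post in a scratch directory.
dir=$(mktemp -d)
printf 'a\nc\ne\n' > "$dir/file3"
printf 'a 1\nb 2\nc 3\nd 4\ne 5\n' > "$dir/file4"

# Print the per-file counter (FNR) and the global counter (NR) for every line.
trace=$(cd "$dir" && awk '{ print FILENAME, "FNR=" FNR, "NR=" NR }' file3 file4)
printf '%s\n' "$trace"

rm -rf "$dir"
```

The three lines of file3 show FNR and NR in step (1 to 3). On the first line of file4, FNR is back at 1 while NR is 4, which is why FNR==NR is false from then on.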
So while this condition holds, the array a keeps growing, with $0 (the complete line of file3 here) as the key. By the end of file3 the array has every line of file3 as a key.
next is like continue in C: it tells awk to skip the remaining rules and start processing the next line.
The rest of the code, ($1 in a), is applied only after all the lines of file3 have been read (that is, from the first line of file4 onwards). $1 represents the first field of file4.
($1 in a) checks whether $1 exists as a key in the array a. If it does, the line is printed, because a pattern without an action prints the line by default.
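The explanation above can be written out as a longer, commented script. This is only a sketch of the same one-liner: the array is renamed to seen for readability, and the matched and unmatched shell variables are illustrative names, not part of the original command.

```sh
#!/bin/sh
# Recreate the two sample files from the post in a scratch directory.
dir=$(mktemp -d)
printf 'a\nc\ne\n' > "$dir/file3"
printf 'a 1\nb 2\nc 3\nd 4\ne 5\n' > "$dir/file4"

matched=$(cd "$dir" && awk '
    # First file only: FNR equals NR, so remember the whole line as a key.
    FNR == NR {
        seen[$0]      # referencing the element is enough to create the key
        next          # skip the remaining rules for this line
    }
    # Second file: print the line when its first field is a stored key.
    $1 in seen { print }
' file3 file4)

# Same program with the test negated: keep only the lines that do not match.
unmatched=$(cd "$dir" && awk '
    FNR == NR { seen[$0]; next }
    !($1 in seen) { print }
' file3 file4)

printf 'matched:\n%s\nunmatched:\n%s\n' "$matched" "$unmatched"
rm -rf "$dir"
```

One caveat: if the first file is empty, FNR==NR stays true while the second file is read, so the one-liner prints nothing in that case.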
I want to compare two files column-wise in Unix using a shell script.
file1
datasrid BMStrid Mersionid country curr
Met_CCD V14121011081 Recent US USD
Met_CCD V14121011082 Recent US USD
Met_CCD V14121011083 Recent GB GDB
Met_CCD V14121011084 Recent IE GDB
Met_CCD V14121011085 Recent GB GDB
Met_CCD V14121011086 Recent AU AUD
Met_CCD V14121011086 Recent HK HKD
Met_CCD V14121011087 Recent IE GDB
file2
datasrid BMStrid Mersionid country curr
Met_CCD V14121011081 Recent US USD
Met_CCD V14121011082 Recent US USD
Met_CCD V14121011083 Recent GB GDB
Met_CCD V14121011088 Recent IE GDB
Met_CCD V14121011085 Recent HK GDB
Met_CCD V14121011086 Recent AU AUD
Met_CCD V14121011086 Recent HK HKD
Met_CCD V14121011087 Recent IE GDB
Output file
I need to compare file2 with respect to file1.
A change in any cell should be highlighted in the output file.
For example, the output file should contain:
Met_CCD 'V14121011088' Recent IE GDB
Met_CCD V14121011085 Recent 'HK' GDB
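One possible way to get that output is to reuse the FNR==NR idiom, this time storing file1 by line number and comparing file2 field by field. This is a rough sketch, not a tested general solution: it assumes both files have the same lines in the same order and are whitespace-separated, and the names ref, old, q and diffs are made up for the example.

```sh
#!/bin/sh
# Recreate the two sample files from the comment in a scratch directory.
dir=$(mktemp -d)
cat > "$dir/file1" <<'EOF'
datasrid BMStrid Mersionid country curr
Met_CCD V14121011081 Recent US USD
Met_CCD V14121011082 Recent US USD
Met_CCD V14121011083 Recent GB GDB
Met_CCD V14121011084 Recent IE GDB
Met_CCD V14121011085 Recent GB GDB
Met_CCD V14121011086 Recent AU AUD
Met_CCD V14121011086 Recent HK HKD
Met_CCD V14121011087 Recent IE GDB
EOF
cat > "$dir/file2" <<'EOF'
datasrid BMStrid Mersionid country curr
Met_CCD V14121011081 Recent US USD
Met_CCD V14121011082 Recent US USD
Met_CCD V14121011083 Recent GB GDB
Met_CCD V14121011088 Recent IE GDB
Met_CCD V14121011085 Recent HK GDB
Met_CCD V14121011086 Recent AU AUD
Met_CCD V14121011086 Recent HK HKD
Met_CCD V14121011087 Recent IE GDB
EOF

# q holds a single quote so the awk program itself can stay single-quoted.
diffs=$(cd "$dir" && awk -v q="'" '
    # file1: remember every line under its line number.
    FNR == NR { ref[FNR] = $0; next }
    # file2: compare each field with the same field of the same line in file1.
    {
        split(ref[FNR], old, " ")
        changed = 0
        for (i = 1; i <= NF; i++) {
            # Appending "" forces a string comparison rather than a numeric one.
            if (($i "") != (old[i] "")) {
                $i = q $i q       # wrap the changed cell in quotes
                changed = 1
            }
        }
        if (changed) print        # only lines with at least one changed cell
    }
' file1 file2)

printf '%s\n' "$diffs"
rm -rf "$dir"
```

With the sample data this prints the two lines shown above. Lines that were added, removed or reordered would need a key-based comparison instead of a line-number one.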