AWK one-liner for multi-column comparision two unsorted files
Saturday, December 29, 2012 awk , compare 5 Comments

This awk one-liner works for multi-column on unsorted files. Comparision is done based on 1st,2nd,3rd,4th of the first file and 1st,3rd,6th,7th of the second file.
File1

7000,2,1,6
7001,2,1,7
7002,2,1,6
7003,1,2,1

File2

7000,john,2,0,0,1,6
7000,john,2,0,0,1,7
7000,john,2,0,0,1,8
7000,john,2,0,0,1,9
7001,elen,2,0,0,1,7
7002,sami,2,0,0,1,6
7003,mike,1,0,0,2,1
7003,mike,1,0,0,2,2
7003,mike,1,0,0,2,3
7003,mike,1,0,0,2,4
8001,nike,1,2,4,1,8
8002,paul,2,0,0,2,7

Output
awk -F, 'NR==FNR{a[$1,$2,$3,$4]++;next} (a[$1,$3,$6,$7])' File1 File2

7000,john,2,0,0,1,6
7001,elen,2,0,0,1,7
7002,sami,2,0,0,1,6
7003,mike,1,0,0,2,1

5 comments:

UnknownJanuary 28, 2014 at 9:55 AM
Hi Vijay, Thanks for the post.
Can you help me out in this. Actually, I have rows like

50121 abc.com 28/1/2014-12:00:00
52111 xyz.com 27/1/2014-12:00:00
deusr abc.com 26/1/2014-12:00:00
50121 abc.com 26/1/2014-12:00:00
52111 abc.com 25/1/2014-12:00:00

I removed duplicates based on first column and got the output as
50121 abc.com 28/1/2014-12:00:00
52111 xyz.com 27/1/2014-12:00:00
deusr abc.com 26/1/2014-12:00:00

but the issue here is I am willing to remove duplicates based on 2 columns comparison. i.e., 1st and 2nd one. I am trying using 'awk' command. But I am not getting it. Can you help me out in this please..
ReplyDelete
Replies
VijayJanuary 28, 2014 at 10:01 PM
@Bharghav, try the below:

awk '{a[$1,$2]=$0}END{for(i in a)print a[i]}' your_file
ReplyDelete
Replies
AnonymousJanuary 14, 2018 at 1:51 PM
Hi Vijay - in your original post the comparison of the 4 columns are based on AND condition.

If we need to use OR condition for comparing those columns, then what would be the AWK command for that?

Thanks
ReplyDelete
Replies
UnknownJune 13, 2018 at 2:03 AM
Hi Vijay,
Thank you for such a great posts.
Can you please help with the below requirement:

File1:
23455|abc|xyz
23455|abc|xsd
34433|wer|sad
45655|fdf|fcd

File2:
v343v|23455
z565z|23455
c9898|34433
b2323|45655

Output should be:
23455|abc|xyz|v343v
23455|abc|xyz|z565z
23455|abc|xsd|v343v
23455|abc|xsd|z565z
34433|wer|sad|c9898
45655|fdf|fcd|b2323

I am using the below command :
awk -F'|' 'NR==FNR { a[$1]=$1"|"$2"|"$3; next} $2 > 0 {print a[$2] "|" $1}' file1 file2 > result.txt

But its only showing the result :
23455|abc|xsd|v343v
23455|abc|xsd|z565z
34433|wer|sad|c9898
45655|fdf|fcd|b2323
ReplyDelete
Replies
raghuMay 29, 2019 at 4:04 AM
I Have two files as below,

first file:

Start State Next State Session Count Transition% LA_product_view home 694 28.660%

LA_product_view searchresults 54 2.230%

home 1101260 2 0.050%

second file:

Start State Next State Session Count Transition%

LA_product_view home 618 27.560%

LA_product_view searchresults 59 2.630%

home price 25 0.360%

i need to compare first two coloumns of first file and first two coloumns of second file and if they both match it should print 3,4 coloums of both file.if both the files has unique they should be listed with NA from the both files. my output should be like:

LA_product_view home 694 28.660% 618 27.560%

LA_product_view searchresults 54 2.230% 59 2.630%

home 1101260 2 0.050% N/A N/A

home price N/A N/A 25 0.360%

can someone please help me in this..!
ReplyDelete
Replies

Add comment

>The Unix Shell

AWK one-liner for multi-column comparision two unsorted files
Saturday, December 29, 2012 awk , compare 5 Comments

Vijay

5 comments:

Popular Posts

Facebook

About The Author

Labels

Search This Blog

Blog Archive

Pages

Gallery

Contact Us

Popular Posts

About us

Labels

AWK one-liner for multi-column comparision two unsorted files Saturday, December 29, 2012 awk , compare 5 Comments

Vijay

5 comments:

Popular Posts

Facebook

About The Author

Labels

Search This Blog

Blog Archive

Pages

Gallery

Contact Us

Popular Posts

AWK one-liner for multi-column comparision two unsorted files
Saturday, December 29, 2012 awk , compare 5 Comments