Comparing two files using awk - An assignement

Monday, May 12, 2014 , 0 Comments

This is an awk assignment given to one of my friend. Its quite challenging. We have two files: File1:(List of companies)
Joe's Garage
Pip Co
Utility Muffin Research Kitchen
File2:(List of payments and dues of the companies in File1)
Pip Co                          $20.13   due
Pip Co                          $20.3   due
Utility Muffin Research Kitchen $2.56    due
Utility Muffin Research Kitchen 2.56    due
Joe's Garage                    $120.28  due
Joe's Garage                    $100.24 payment
Now the challenge is we need to create an output file which states the total amount due by each company. Additionally there is one more requirement where we need to handle the format errors in teh File2.
  1. The list of fomrat errors to be handled are:
  2. The dollor symbol not present in the amount
There should be exactly 2 decimals after the decimal point.
If any of the above format errors are encountered, then the complete line should be ignored and proceed to the next line.
The expected output here is:
Joe's Garage $20.04
Utility Muffin Research Kitchen $2.56
Pip Co $20.13
Below is the awk script that I have written for this. and its working at my side.
{
   if(FNR==NR)
   {
          for(i=1;i<=NF;i++)
          str=str","$i;
          a[str]=1;str="";
          next;
   }
   {
   if($(NF-1)!~/^\$/)
   {
   print "Format Error!-No dollor sign"FNR,FILENAME,$(NF-1);
   next;
   }
   if($(NF-1)!~/\.[0-9][0-9]$/)
   {
   print "Format Error!-should have 2 digits after a decimal point"FNR,FILENAME,$(NF-1);
   next;
   }
   for(i=1;i<(NF-1);i++)str=str","$i;
   if(a[str]){
   gsub(/\$/,"",$(NF-1));
   if($NF~/payment/){
     a[str]-=$(NF-1);}
   else if($NF~/due/){
     a[str]+=$(NF-1);}
   }
   str="";
  }
}
END{ 
   for(i in a)
   {
    t=i;
    gsub(/,/," ",t);
    print t,"$"(a[i]-1);
   }
}
I am sure that this can be optimized. I put it long so that its more convincing to all. Below is the way we have to execute this. I am using nawk on solaris.Others can use awk itself. Copy the above code in a file and name it as mycode.awk and then execute the awk command as below:
nawk -f mycode.awk File1 File2
Out that I have got with the above command is:
> nawk -f temp.awk temp2 temp1
Format Error!-should have 2 digits after a decimal point2 temp1 $20.3
Format Error!-No dollor sign4 temp1 2.56
 Joe's Garage $20.04
 Utility Muffin Research Kitchen $2.56
 Pip Co $20.13
>

0 comments: