Counting duplicates in a file
I have a file where column one has a list of family identifiers:

    AB
    AB
    AB
    AB
    SAR
    SAR
    EAR

Is there a way to create a new column in which each repeat is numbered, so that every occurrence gets its own label? For example:
    AB_1
    AB_2
    AB_3
    AB_4
    SAR_1
    SAR_2
    EAR_1

Below is a pretty simple solution for this:
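    awk '{print $1"_"++a[$1]}' file

The associative array a keeps a running counter for each identifier, so once the whole file has been read it holds the total number of occurrences of each one. You can print the array from an END block if you only want to see the counts.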
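As a rough sketch of both uses, the snippet below runs the labelling one-liner and an END-block variant over the sample identifiers from the question. The variable names (input, labels, totals) are only illustrative, and the sort is added here because awk does not guarantee any order when looping over an array.

```shell
# Sample input from the question, one identifier per line.
input=$(printf '%s\n' AB AB AB AB SAR SAR EAR)

# Label every occurrence with its running count.
# The parentheses only make the pre-increment grouping explicit.
labels=$(printf '%s\n' "$input" | awk '{print $1 "_" (++a[$1])}')

# Count silently, then print one total per identifier from the END block.
# "for (k in a)" visits keys in unspecified order, hence the sort.
totals=$(printf '%s\n' "$input" | awk '{a[$1]++} END {for (k in a) print k, a[k]}' | sort)

printf '%s\n' "$labels"
printf '%s\n' "$totals"
```

If the labels should appear as an extra column rather than replace the first one, printing $0 in front of the label (for example print $0, $1 "_" (++a[$1])) should do it.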