awk - counting the number of residues in a file -
I have a file like the one below and want to count the number of characters (residues) in each record.
>1dmla
mtdspggvapaspvedasdaslgqpeegapcqvvlqgaelngilqafaplrtslldsllvmgdrgilihntifgeqvflp
lehsqfsryrwrgptaaflslvdqkrsllsvfranqypdlrrvelaitgqapfrtlvqriwtttsdgeavelasetlmkr
eltsfvvlvpqgtpdvqlrltrpqltkvlnatgadsatpttfelgvngkfsvfttstcvtfaareegvssststqvqils
naltkagqaaanaktvygenthrtfsvvvddcsmravlrrlqvgggtlkfflttpvpslcvtatgpnavsavfllkpqk
>1dmlb
ddvaarlraagfgavgagataeetrrmlhrafdtla
>2bhdc
mtdspggvapaspvedasdaslgqpeegapcqvvlqgaelngilqafaplrtslldsllvmgdrgilihntifgeqvflp
lehsqfsryrwrgptaaflslvdqkrsllsvfranqypdlrrvelaitgqapfrtlvqriwtttsdgeavelasetlmkr
eltsfvvlvpqgtpdvqlrltrpqltkvlnatgadsatpttfelgvngkfsvfttstcvtfaareegvssststqvqils
I tried the following code:
awk '/^>/ { res=substr($0, 2); } /^[^>]/ { print res " - " length($0); }' <file
The output of the above code is one count per sequence line rather than one per record:
1dmla - 80
1dmla - 80
1dmla - 80
1dmla - 79
1dmlb - 36
2bhdc - 80
2bhdc - 80
2bhdc - 80
My desired output is:
1dmla - 319
1dmlb - 36
2bhdc - 240
How can I alter the above code to get the desired output?
Here's one way using awk:
awk '/^>/ && r { print r, "-", s; r=s="" } /^>/ { r = substr($0, 2); next } { s += length } END { print r, "-", s }' file
Results:
1dmla - 319
1dmlb - 36
2bhdc - 240
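For readers who find the one-liner dense, the same accumulate-and-flush idea can be spelled out with comments. This is only a sketch: the function name `count_residues` and the variable names `id` and `total` are made up for illustration, and it assumes every record starts with a `>` header line and that sequence lines carry no trailing whitespace (any such characters would be counted).

```shell
# count_residues: hypothetical helper name, not from the original post.
# Reads FASTA-style text from file arguments (or stdin if none are given)
# and prints "<id> - <total sequence length>" once per record.
count_residues() {
    awk '
        /^>/ {                                     # header line starts a new record
            if (id != "") print id, "-", total     # flush the previous record, if any
            id = substr($0, 2)                     # drop the leading ">"
            total = 0
            next
        }
        { total += length($0) }                    # sequence line: add its length
        END { if (id != "") print id, "-", total } # flush the last record
    ' "$@"
}

# Small made-up input: record "a" has 3 + 2 residues, record "b" has 4.
printf '>a\nmtd\nsp\n>b\nggva\n' | count_residues
```

Run as `count_residues file` on the real data. The key point is that a record's total can only be printed when the next header (or the end of input) is reached, which is why the last record is flushed in the `END` block.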