I am trying to modify the awk command below to include the gene name ($5) for each target, but cannot get it to work. I am also not sure the calculation is right: for all rows that share the same target in $4, it should average the values in $7. Thank you.
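output.bam.hist.txt (input; second Code block below)
epilepsy70_average.txt (current output; third Code block below)
Desired epilepsy70_average.txt (fourth Code block below)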
Code:
awk '{if(len==0){last=$4;total=$7;len=1;getline}if($4!=last){printf("%s\t%f\n", last, total/len);last=$4;total=$7;len=1}else{total+=$7;len+=1}}END{printf("%s\t%f\n", last, total/len)}' output.bam.hist.txt > epilepsy70_average.txt
Code:
chr1 40539722 40539865 chr1:40539722-40539865 PPT1 1 159
chr1 40539722 40539865 chr1:40539722-40539865 PPT1 2 161
chr1 40539722 40539865 chr1:40539722-40539865 PPT1 3 161
Code:
chr1:40539722-40539865 72.000000
chr1:40542503-40542595 46.500000
chr1:40544221-40544340 60.000000
Code:
chr1:40539722-40539865 72.000000 PPT1
chr1:40542503-40542595 46.500000 PPT1
chr1:40544221-40544340 60.000000 PPT1
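One possible way to get the desired layout is to remember $5 along with $4 whenever the target changes, and print it as a third column. The sketch below is only that, a sketch: it assumes rows for the same target are adjacent (as in the sample input), that every row of a target carries the same gene name, and that each $7 value should count equally in the average. The function name average_by_target is made up for illustration; it is not from the original command. It also avoids getline, so a single-row file or an empty file is handled without special cases.

```shell
# Hypothetical helper: average $7 per target ($4) and append the gene name ($5).
# Assumes rows belonging to the same target are adjacent in the input.
average_by_target() {
  awk '
    NF < 7 { next }                       # skip blank or short lines
    NR == 1 || $4 != last {               # a new target starts on this row
        if (n > 0) printf "%s\t%f\t%s\n", last, total / n, gene
        last = $4; gene = $5; total = 0; n = 0
    }
    { total += $7; n++ }                  # accumulate the current target
    END { if (n > 0) printf "%s\t%f\t%s\n", last, total / n, gene }
  ' "$@"
}

# Usage with the file names from the post:
#   average_by_target output.bam.hist.txt > epilepsy70_average.txt
```

For the three sample rows shown above, this averages 159, 161 and 161 for chr1:40539722-40539865 and prints PPT1 in the third column. Whether a plain mean of $7 is the right calculation depends on what $6 and $7 represent in output.bam.hist.txt; if $7 is a count that should be weighted by another column, the accumulation line would need to change.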