Hi all,
I am using the SAMtools pileup format (created with 'samtools pileup -c') for SNP calling. I need to extract the frequency of each variation from the read base column of the pileup. However, there is one part of the description of this column that I do not understand:
... Also at the read base column, a symbol ‘^’ marks the start of a read segment which is a contiguous subsequence on the read separated by ‘N/S/H’ CIGAR operations. The ASCII of the character following ‘^’ minus 33 gives the mapping quality. A symbol ‘$’ marks the end of a read segment.
Would someone care to explain this in different words? My pileup contains many examples of '^~.' (at first glance, at loci covered by only a few reads) and '$'. How do I interpret these?
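To make the quoted rule concrete, here is a minimal Python sketch of how the read base column might be walked. It is only an illustration under stated assumptions: the function name parse_read_bases is hypothetical and not part of SAMtools, and the handling of '+'/'-' indel entries (sign, length, then that many bases) follows the general pileup format description rather than anything in the excerpt above.

```python
from collections import Counter

def parse_read_bases(column):
    """Walk a pileup read-base column (hypothetical helper, not a SAMtools API).

    Returns (base_counts, start_mapqs, read_ends).
    """
    bases = Counter()
    start_mapqs = []   # one mapping quality per '^' marker
    read_ends = 0      # number of '$' markers
    i = 0
    while i < len(column):
        ch = column[i]
        if ch == '^':
            # The character after '^' encodes the mapping quality: ASCII - 33.
            start_mapqs.append(ord(column[i + 1]) - 33)
            i += 2
        elif ch == '$':
            # End of a read segment; carries no base information.
            read_ends += 1
            i += 1
        elif ch in '+-':
            # Assumed indel syntax: sign, decimal length, then that many bases.
            j = i + 1
            while j < len(column) and column[j].isdigit():
                j += 1
            i = j + int(column[i + 1:j])
        else:
            # A base symbol ('.', ',', a letter, or '*'); letters are upper-cased.
            bases[ch.upper()] += 1
            i += 1
    return bases, start_mapqs, read_ends

bases, mapqs, ends = parse_read_bases("^~.,$A+2ACg")
print(bases, mapqs, ends)
```

Under this reading, '^~.' is a read segment that starts at this position with mapping quality ord('~') - 33 = 93, followed by its base symbol '.', and a '$' simply marks that the preceding base was the last one of its segment. Neither marker should be counted as a base when computing frequencies.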
Also, I have been trying to find an explanation of the CIGAR format. So far, my search has led me to Exonerate and to the remark that different versions of CIGAR apparently exist. I could not find any operations other than M, D and I, let alone N, S or H. Is there any documentation on the CIGAR format other than the Exonerate man pages?
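As a rough sketch of what "read segments separated by N/S/H" could mean in practice, the Python below splits a CIGAR string into (length, operation) pairs and groups them into segments. The set of operation letters (M, I, D, N, S, H, P) is assumed from the SAM format description, and the names parse_cigar and read_segments are hypothetical, so treat this as an illustration rather than the way SAMtools does it.

```python
import re

# Assumed SAM-style CIGAR operations: a decimal length followed by one letter.
CIGAR_OP = re.compile(r'(\d+)([MIDNSHP])')

def parse_cigar(cigar):
    """Return a list of (length, op) pairs for a CIGAR string."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]

def read_segments(cigar):
    """Group operations into read segments, breaking at N, S and H."""
    segments, current = [], []
    for length, op in parse_cigar(cigar):
        if op in 'NSH':
            # A separator closes the segment collected so far, if any.
            if current:
                segments.append(current)
                current = []
        else:
            current.append((length, op))
    if current:
        segments.append(current)
    return segments

print(read_segments("5S20M1000N30M2I10M"))
# [[(20, 'M')], [(30, 'M'), (2, 'I'), (10, 'M')]]
```

In this example the read would contribute two '^' markers and two '$' markers to the pileup, one pair for each of its two segments.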
Thank you,
Wil
ps thanks for all the hard work on SAMtools and Picard!