Fastest way to extract differing positions from each alignment in a BAM file
Hi,
What would be the fastest way (I have to do this hundreds of millions of times) to extract the following for each aligned read in a BAM file:
1) The positions where the read bases differ from a reference sequence.
2) The PHRED base quality values of these bases. If the difference is an indel, the quality value will, of course, be skipped.
As far as I know, I cannot use mpileup or any other tool I know of because of memory limitations: this is a very custom amplicon reference analysis, with a coverage of >500 million reads per base position on the reference amplicon.
In short, I need to apply an efficient approach to extract all differing positions for each aligned read.
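To make the requirement concrete, here is a minimal sketch of the per-read logic in plain Python. It walks the CIGAR string of one alignment and compares read bases with the reference, so only one read is held in memory at a time. The function name `read_differences` and its argument layout are hypothetical; it assumes a 0-based alignment start, the reference held as a string, and qualities already decoded to integers (for example the values a library such as pysam exposes as `reference_start`, `cigarstring`, `query_sequence` and `query_qualities`). It is not tuned for speed and only shows what should be extracted.

```python
import re

# One CIGAR element: a length followed by an operation code (SAM spec).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def read_differences(ref, pos, cigar, seq, quals):
    """Return (ref_pos, kind, ref_bases, read_bases, qual) tuples for one read.

    ref   : reference sequence as a string (assumption: fits in memory)
    pos   : 0-based leftmost reference position of the alignment
    cigar : CIGAR string, e.g. "3M1I2M"
    seq   : read bases as stored in the BAM record
    quals : list of integer PHRED qualities, same length as seq
    """
    diffs = []
    r = pos  # current reference coordinate
    q = 0    # current index into the read
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":  # aligned block: compare base by base
            for i in range(n):
                rb, qb = ref[r + i], seq[q + i]
                if rb != qb:
                    diffs.append((r + i, "mismatch", rb, qb, quals[q + i]))
            r += n
            q += n
        elif op == "I":  # insertion: consumes read only, quality skipped
            diffs.append((r, "insertion", "", seq[q:q + n], None))
            q += n
        elif op == "D":  # deletion: consumes reference only, no quality
            diffs.append((r, "deletion", ref[r:r + n], "", None))
            r += n
        elif op == "N":  # reference skip: not reported as a difference
            r += n
        elif op == "S":  # soft clip: bases present in seq but not aligned
            q += n
        # H and P consume neither the read nor the reference
    return diffs

# Toy example with made-up data
ref = "ACGTACGTAC"
for d in read_differences(ref, 2, "3M1I2M1D2M", "GTTCCGAG",
                          [30, 31, 32, 33, 34, 35, 36, 37]):
    print(d)
```

Because each read is processed independently, the records can be streamed from the BAM file one at a time and the results written out immediately, so memory use should not grow with coverage.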
Thanks.
Last edited by CHRYSES; 12-14-2011 at 06:52 AM.