I am mapping some 2x100 paired-end RNA-Seq data, and BWA is mapping the second read very oddly for some of my samples.
I am testing a mix of BWA and TopHat for mapping the reads, and BWA sometimes reports both reads in a pair as mapping to the same location even though the second read is full of mismatches. I have attached an example IGV screenshot showing the BWA+TopHat mapped reads on the top and the TopHat-only mapped reads on the bottom.
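To put a number on how often this happens, one option is to scan the BWA output for second-in-pair reads with a high edit distance. The following is only a rough sketch: it assumes the alignments are available as SAM text and that each record carries the `NM:i:` edit-distance tag. The function name `high_mismatch_second_reads` and the 10% threshold are made up for illustration.

```python
def high_mismatch_second_reads(sam_lines, min_fraction=0.1):
    """Yield (name, ref, pos, nm, read_len) for mapped second-in-pair reads
    whose NM edit distance is at least min_fraction of the read length.

    min_fraction is an arbitrary illustrative threshold, not a BWA setting.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        # 0x80 = second read in pair, 0x4 = read unmapped
        if not (flag & 0x80) or (flag & 0x4):
            continue
        read_len = len(fields[9])
        nm = None
        for tag in fields[11:]:
            if tag.startswith("NM:i:"):
                nm = int(tag[5:])
                break
        if nm is None or read_len == 0:
            continue  # no edit distance available for this record
        if nm / read_len >= min_fraction:
            yield fields[0], fields[2], int(fields[3]), nm, read_len


if __name__ == "__main__":
    import sys
    # Example: samtools view aligned.bam | python this_script.py
    for hit in high_mismatch_second_reads(sys.stdin):
        print("\t".join(str(value) for value in hit))
```

Comparing the count of flagged reads between affected and unaffected samples might show whether the problem is confined to particular libraries.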
This happens for some of the samples but not for others, even though the libraries were all prepared the same way. The coverage graphs on the top are identical, and since expression is measured as FPKM, there isn't likely to be much effect on differential expression analysis. Indeed, when comparing the sample mentioned above mapped with either BWA+TopHat or just TopHat, the R-squared is 0.9705.
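For reference, a comparison like this can be reproduced with a few lines of Python. This is a minimal sketch that assumes R-squared here means the squared Pearson correlation between per-gene FPKM values from the two pipelines, paired in the same gene order. The function name `r_squared` and the input lists are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length value lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-gene FPKM values, same gene order in both lists
fpkm_bwa_tophat = [1.0, 2.0, 3.0, 4.0]
fpkm_tophat_only = [1.0, 3.0, 2.0, 4.0]
print(r_squared(fpkm_bwa_tophat, fpkm_tophat_only))
```

Because a few highly expressed genes can dominate the result, computing the same value on log-transformed FPKM may be a useful cross-check.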
I was planning on playing around with the BWA settings a bit, but I thought I'd pose the problem here to see if anyone had some suggestions.