01-14-2013, 04:31 AM   #1
Junior Member
Location: U.S.A.

Join Date: Nov 2012
Posts: 7
bam2fastq discarded reads

Hi all,

I've been using bam2fastq on my tophat output and it's been great: it runs really quickly. The one problem is the number of reads being discarded. For example, this is what it printed for one of my tophat output files:

This looks like paired data from lane 239.
Output will be in x_1.fastq and x_2.fastq
60465861 sequences in the BAM file
60465861 sequences exported
WARNING: 6585459 reads could not be matched to a mate and were not exported

That's about 11% of the reads being discarded, and in other files it's even more (I just ran it on another file and 17% of the reads were discarded). What I don't understand is why. The paired-end files that went into tophat were quality filtered with software that handles pairs directly, so both files have the same number of reads, every read has a mate, and both files are in the same order (tophat freaks out otherwise). If any read didn't have a mate, tophat would have returned an error. So why is bam2fastq discarding these reads?
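
For anyone who wants to check the discarded fraction on their own runs, here is a minimal sketch that pulls the two counts out of the bam2fastq summary and divides them. It assumes the summary lines look exactly like the ones pasted above; the function name `discarded_fraction` is just something made up for this example, not part of bam2fastq.

```python
import re

def discarded_fraction(log_text):
    """Return (discarded, total, fraction) parsed from a bam2fastq summary.

    Assumes the line wording matches the output pasted in this post.
    """
    total = int(re.search(r"(\d+) sequences in the BAM file", log_text).group(1))
    discarded = int(
        re.search(r"WARNING: (\d+) reads could not be matched", log_text).group(1)
    )
    return discarded, total, discarded / total

# Summary lines copied from the run shown above.
log = """\
60465861 sequences in the BAM file
60465861 sequences exported
WARNING: 6585459 reads could not be matched to a mate and were not exported
"""

discarded, total, frac = discarded_fraction(log)
print(f"{discarded} of {total} reads discarded ({frac:.1%})")
# prints: 6585459 of 60465861 reads discarded (10.9%)
```

Running the same function on the summary from each file makes it easy to compare how much is being lost per sample.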

Last edited by Derek-C; 01-14-2013 at 04:47 AM.