Old 04-11-2012, 12:22 PM   #1
Junior Member
Location: New York

Join Date: Jan 2012
Posts: 9
Problems with BAM files from TopHat


I'm having some trouble with the BAM files output by TopHat, and I hope someone here can help me.

1. I ran TopHat (1.4.1) against UCSC hg19 WITHOUT "--no-sort-bam", but the resulting BAM files are still ordered chr1, chr10, chr12 ... chr2 ... Did I do something wrong?
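
To illustrate the ordering in point 1: chr1, chr10, chr12 ... chr2 is what plain string sorting of the chromosome names gives, as opposed to numeric order. Here is a minimal Python sketch of the difference. The name list and the `karyotypic_key` helper are made up for illustration (they are not part of TopHat or Picard), and the placement of chrX/chrY/chrM at the end is just one possible convention; the order a given tool expects depends on its reference file.

```python
def karyotypic_key(name):
    """Hypothetical sort key: numbered chromosomes in numeric order,
    then X, Y, M, then anything else alphabetically."""
    suffix = name[3:] if name.startswith("chr") else name
    if suffix.isdigit():
        return (0, int(suffix), "")
    special = {"X": 0, "Y": 1, "M": 2}
    if suffix in special:
        return (1, special[suffix], "")
    return (2, 0, suffix)

# Illustrative subset of contig names, not the full hg19 list
names = ["chr1", "chr2", "chr10", "chr12", "chrX", "chrY", "chrM"]

lexicographic = sorted(names)                    # plain string comparison
karyotypic = sorted(names, key=karyotypic_key)   # numeric-aware ordering

print(lexicographic)
print(karyotypic)
```

With string comparison "chr10" sorts before "chr2" because the characters are compared one at a time, which matches the order I see in the BAM files.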

2. I can work around the above by using ReorderSam from Picard. But when I then tried to use RNA-SeQC to check the quality, it threw an exception:

net.sf.picard.PicardException: Found 25613 unpaired mates
at net.sf.picard.sam.SamToFastq.doWork(
at net.sf.picard.cmdline.CommandLineProgram.instanceMain(
at org.broadinstitute.cga.picardbased.CountAligned.getFastQ(
at org.broadinstitute.cga.picardbased.CountAligned.countBAM(
at org.broadinstitute.cga.rnaseq.ReadCountMetrics.alignAndCountrRNA(
at org.broadinstitute.cga.rnaseq.ReadCountMetrics.runReadCountMetrics(
at org.broadinstitute.cga.rnaseq.RNASeqMetrics.runMetrics(
at org.broadinstitute.cga.rnaseq.RNASeqMetrics.execute(
at org.broadinstitute.cga.rnaseq.RNASeqMetrics.main(

So I checked the BAM file and did indeed find some reads whose mates are not mapped. My questions are: why does TopHat output reads whose mates are not mapped? Can I simply remove these "unpaired mapped reads"? How would that affect my downstream analysis?
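
For reference, reads like these can be recognised from the FLAG field of each SAM record. The following is a minimal sketch in Python, assuming the standard SAM flag bits (0x1 = read is paired, 0x4 = read itself unmapped, 0x8 = mate unmapped). The function name and the example flag values are hypothetical and not taken from my data.

```python
# Standard SAM FLAG bits
PAIRED = 0x1         # read is part of a pair
UNMAPPED = 0x4       # this read is unmapped
MATE_UNMAPPED = 0x8  # the mate of this read is unmapped

def is_mapped_with_unmapped_mate(flag):
    """True for a paired read that is mapped while its mate is not."""
    return (
        bool(flag & PAIRED)
        and not (flag & UNMAPPED)
        and bool(flag & MATE_UNMAPPED)
    )

# Hypothetical FLAG values as they might appear in column 2 of a SAM file
example_flags = [99, 147, 73, 137, 69]

orphans = [f for f in example_flags if is_mapped_with_unmapped_mate(f)]
print(orphans)
```

The same bit test should correspond to `samtools view -f 8 -F 4` on the command line, which keeps records with the mate-unmapped bit set and the read-unmapped bit clear.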

I will be grateful for any comments or suggestions! Thanks!!
bjchen