Hi,
I am having some trouble with the BAM files output by TopHat, and I hope someone here can help me.
1. I ran TopHat (1.4.1) against UCSC hg19 WITHOUT "--no-sort-bam", but the resulting BAM files are still ordered as chr1, chr10, chr12...chr2... Did I do anything wrong?
2. I can work around the above by using ReorderSam from Picard. But when I then tried to use RNA-SeQC to check the quality, it threw an exception:
net.sf.picard.PicardException: Found 25613 unpaired mates
at net.sf.picard.sam.SamToFastq.doWork(SamToFastq.java:153)
at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:169)
at org.broadinstitute.cga.picardbased.CountAligned.getFastQ(CountAligned.java:310)
at org.broadinstitute.cga.picardbased.CountAligned.countBAM(CountAligned.java:115)
at org.broadinstitute.cga.rnaseq.ReadCountMetrics.alignAndCountrRNA(ReadCountMetrics.java:209)
at org.broadinstitute.cga.rnaseq.ReadCountMetrics.runReadCountMetrics(ReadCountMetrics.java:63)
at org.broadinstitute.cga.rnaseq.RNASeqMetrics.runMetrics(RNASeqMetrics.java:211)
at org.broadinstitute.cga.rnaseq.RNASeqMetrics.execute(RNASeqMetrics.java:162)
at org.broadinstitute.cga.rnaseq.RNASeqMetrics.main(RNASeqMetrics.java:131)
So I checked the BAM file and did indeed find some reads whose mates are not mapped. My questions are: why does TopHat output reads whose mates are not mapped? Can I simply remove these "unpaired mapped reads"? How would that affect my downstream analysis?
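For anyone wanting to reproduce the check, here is a minimal sketch of how mapped reads with an unmapped mate can be picked out from the SAM FLAG field. It assumes the BAM has been converted to SAM text (for example with samtools view -h), and the example records are hypothetical, made up for illustration. It only inspects the FLAG bits; whether this is exactly what SamToFastq counts as "unpaired mates" is an assumption, since a mate record that is missing from the file altogether would not be detected this way.

```python
# SAM FLAG bits, as defined in the SAM format specification.
PAIRED = 0x1         # read is part of a pair
UNMAPPED = 0x4       # this read is unmapped
MATE_UNMAPPED = 0x8  # the mate of this read is unmapped


def is_mapped_with_unmapped_mate(flag: int) -> bool:
    """True for a paired, mapped read whose mate is flagged as unmapped."""
    return (
        bool(flag & PAIRED)
        and not (flag & UNMAPPED)
        and bool(flag & MATE_UNMAPPED)
    )


def count_mapped_with_unmapped_mate(sam_lines):
    """Count alignment records (skipping '@' header lines) matching the test above."""
    count = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if is_mapped_with_unmapped_mate(flag):
            count += 1
    return count


# Hypothetical records, for illustration only (not taken from the real file).
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t99\tchr1\t100\t50\t50M\t=\t300\t250\t*\t*",  # paired, mate mapped
    "r2\t73\tchr1\t500\t50\t50M\t=\t500\t0\t*\t*",    # mapped, mate unmapped
    "r2\t133\tchr1\t500\t0\t*\t=\t500\t0\t*\t*",      # the unmapped mate itself
]
print(count_mapped_with_unmapped_mate(example))
```

The same selection can be made directly on the BAM with samtools view -f 8 -F 4, which requires the mate-unmapped bit and excludes reads that are themselves unmapped.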
I will be grateful for any comments or suggestions! Thanks!!