When I run TopHat 2.0.6 to align my RNA-Seq reads to dm3, it fails when it reaches the tophat_reports step. The run log is:
[2013-03-23 05:29:05] Reporting output tracks
[FAILED]
Error running tophat_reports --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir tophat-out/ --max-multihits 20 --max-seg-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --max-insertion-length 3 --max-deletion-length 3 --bowtie1 -z gzip --gtf-annotations data/genome/drosophila/dm3_r5_flyBaseGene.gff --gtf-juncs tophat-out/tmp/dm3_r5_flyBaseGene.juncs --no-closure-search --no-coverage-search --no-microexon-search --library-type fr-unstranded --sam-header tophat-out/tmp/dm_genome.bwt.samheader.sam --report-discordant-pair-alignments --report-mixed-alignments --samtools=/bin/samtools --bowtie2-max-penalty 6 --bowtie2-min-penalty 2 --bowtie2-penalty-for-N 1 --bowtie2-read-gap-open 5 --bowtie2-read-gap-cont 3 --bowtie2-ref-gap-open 5 --bowtie2-ref-gap-cont 3 data/genome/bowtie_indexes/dm.fa tophat-out/junctions.bed tophat-out/insertions.bed tophat-out/deletions.bed tophat-out/fusions.out tophat-out/tmp/accepted_hits tophat-out/tmp/left_kept_reads.m2g.bam,tophat-out/tmp/left_kept_reads.m2g_um.mapped.bam tophat-out/tmp/left_kept_reads.bam
Loaded 51134 junctions
It seems the accepted_hits*.bam files are incomplete, since running samtools on one of them returned:
samtools sort accepted_hits0.bam sort
[bam_header_read] EOF marker is absent. The input is probably truncated.
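To see which of the intermediate BAM files are truncated without running samtools on each one, a minimal Python sketch like the following could be used. It only checks whether a file ends with the 28-byte empty BGZF block that the BAM specification defines as the end-of-file marker, which is the condition the samtools message above refers to. The function name `has_bgzf_eof` is my own and is not part of TopHat or samtools.

```python
import os
import sys

# The 28-byte empty BGZF block that the SAM/BAM specification
# defines as the end-of-file marker of a complete BAM file.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600" "4243" "0200"
    "1b00" "0300" "0000000000000000"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as fh:
        # Read only the last 28 bytes of the file.
        fh.seek(-len(BGZF_EOF), os.SEEK_END)
        return fh.read() == BGZF_EOF


if __name__ == "__main__":
    # Paths are taken from the command line; nothing is assumed about
    # where TopHat wrote its temporary files.
    for bam_path in sys.argv[1:]:
        status = "ok" if has_bgzf_eof(bam_path) else "EOF marker absent"
        print(f"{bam_path}\t{status}")
```

Saved under a hypothetical name such as `check_bam_eof.py`, it could be run as `python check_bam_eof.py tophat-out/tmp/*.bam`. A missing marker only shows that the file was not closed properly; it does not say why tophat_reports stopped writing.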
However, when I align other data to hg19, TopHat works well; it ran successfully for 4 datasets.
I compared the run logs for hg19 and dm3. In the dm3 run it seems that no insertions, deletions, or junctions were found, and none of the left_kept_reads.m2g_um.candidates*.bam files are included in the tophat_reports argument list.
By the way, all my data are 100 bp single-end reads.
I have checked the TopHat Python source code, but tophat_reports is a compiled binary, so I don't know what it does internally.
Any suggestions?