I'm running TopHat with the following options:
tophat --keep-tmp --solexa1.3-quals --butterfly-search --min-intron-length 10 --max-intron-length 20000
and getting an error at the very end. This only happens on 1 out of 4 of my data files.
[Fri Nov 26 02:19:27 2010] Beginning TopHat run (v1.1.4)
-----------------------------------------------
[Fri Nov 26 02:19:27 2010] Preparing output location ./tophat_out/
[Fri Nov 26 02:19:27 2010] Checking for Bowtie index files
[Fri Nov 26 02:19:27 2010] Checking for reference FASTA file
[Fri Nov 26 02:19:27 2010] Checking for Bowtie
Bowtie version: 0.12.7.0
[Fri Nov 26 02:19:27 2010] Checking for Samtools
Samtools version: 0.1.8.0
[Fri Nov 26 02:19:27 2010] Checking reads
min read length: 20bp, max read length: 80bp
format: fastq
quality scale: phred64 (reads generated with GA pipeline version >= 1.3)
[Fri Nov 26 02:24:48 2010] Mapping reads against merlin with Bowtie
[Fri Nov 26 02:35:18 2010] Joining segment hits
[Fri Nov 26 02:40:36 2010] Mapping reads against merlin with Bowtie(1/3)
[Fri Nov 26 02:46:23 2010] Mapping reads against merlin with Bowtie(2/3)
[Fri Nov 26 02:53:25 2010] Mapping reads against merlin with Bowtie(3/3)
[Fri Nov 26 02:53:29 2010] Searching for junctions via segment mapping
[Fri Nov 26 03:01:00 2010] Retrieving sequences for splices
[Fri Nov 26 03:01:01 2010] Indexing splices
[Fri Nov 26 03:01:06 2010] Mapping reads against segment_juncs with Bowtie
[Fri Nov 26 03:10:01 2010] Mapping reads against segment_juncs with Bowtie
[Fri Nov 26 03:20:37 2010] Mapping reads against segment_juncs with Bowtie
[Fri Nov 26 03:20:47 2010] Joining segment hits
[Fri Nov 26 03:22:15 2010] Reporting output tracks
Error: could not convert to BAM with samtools
I can work around it by doing the steps below, so just posting in case others run into this.
# Fix the SO:sorted header
sed s/sorted/unsorted/ tophat_out/tmp/accepted_hits.sam > fixed.sam
# Use picard to fix clipping
java -jar picard-tools-1.35/CleanSam.jar INPUT=fixed.sam OUTPUT=fixed2.sam
# Convert to bam
java -jar picard-tools-1.35/SortSam.jar INPUT=fixed2.sam OUTPUT=accepted_hits.bam SORT_ORDER=coordinate
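One caveat on the sed step above: as written it replaces the first "sorted" on every line, so it could in principle also touch a read name or tag that happens to contain that word. A slightly stricter variant, as a rough sketch, limits the substitution to the @HD header line. The function name fix_sam_header is just something I made up for illustration, and I'm assuming the header carries the literal string SO:sorted (which is not one of the sort-order values the SAM spec defines: unknown, unsorted, queryname, coordinate).

```shell
# Hypothetical helper (name is illustrative, not part of TopHat or Picard):
# rewrite SO:sorted to SO:unsorted on the @HD header line only,
# leaving every alignment record untouched.
fix_sam_header() {
    # $1 = path to the SAM file; the fixed SAM is written to stdout
    sed '/^@HD/s/SO:sorted/SO:unsorted/' "$1"
}
```

It would be used in place of the plain sed call, e.g. fix_sam_header tophat_out/tmp/accepted_hits.sam > fixed.sam, with the two Picard steps unchanged.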