#1
Senior Member
Location: MDC, Berlin, Germany
Join Date: Oct 2009
Posts: 317
I ran TopHat2 with bowtie1 because I am dealing with color-space reads. The command line I used was:
Code:
tophat --bowtie1 --keep-tmp -o T34_tophat2 -p 8 --color --quals --library-type=fr-secondstrand --transcriptome-index=transcriptome/hg19_Ensemble.GRCh37_65 /home/xwang/data/hg19/bowtie_index/hg19.color T34.csfasta T34.qual
Code:
[2012-04-27 10:36:10] Beginning TopHat run (v2.0.0)
-----------------------------------------------
[2012-04-27 10:36:10] Checking for Bowtie
    Bowtie version: 0.12.7.0
[2012-04-27 10:36:11] Checking for Samtools
    Samtools version: 0.1.17.0
[2012-04-27 10:36:11] Checking for Bowtie index files
[2012-04-27 10:36:11] Checking for Bowtie index files
[2012-04-27 10:36:11] Checking for reference FASTA file
[2012-04-27 10:36:11] Generating SAM header for /home/xwang/data/hg19/bowtie_index/hg19.color
    format: fasta
[2012-04-27 10:38:10] Reading known junctions from GTF file
[2012-04-27 10:38:48] Preparing reads
    left reads: min. length=50, count=64422218
[2012-04-27 11:43:11] Using pre-built transcriptome index..
[2012-04-27 11:43:49] Mapping left_kept_reads against transcriptome hg19_Ensemble.GRCh37_65 with Bowtie
[2012-04-27 12:11:41] Converting left_kept_reads.m2g to genomic coordinates (map2gtf)
[2012-04-27 12:14:57] Resuming TopHat pipeline with unmapped reads
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from "T34_tophat2/tmp/left_kept_reads.m2g_um.fq".
[2012-04-27 12:14:57] Reporting output tracks
-----------------------------------------------
[2012-04-27 13:08:39] Run complete: 02:32:28 elapsed
Any hints? Thanks.
__________________
Xi Wang

#2
Junior Member
Location: Storrs, Connecticut, US
Join Date: Apr 2012
Posts: 5
Hi,
I have a very similar issue. I used samtools to check that every BAM file could be opened without error. Were you able to solve your problem?
Thanks,
Saad

#4
Member
Location: USA
Join Date: Apr 2009
Posts: 36
I have the same issue with TopHat2, using bowtie2. Some reads have qualities; others just have "*" in the quality field. Here are two examples, the first with no quality and the second with quality:
Code:
HWI-ST201:229:C07HGACXX:2:1306:5066:164732:1:N:0:ATCACG 321 1 10015 0 91M X 155260312 0 ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA * AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:91 YT:Z:UU NH:i:20 CC:Z:5 CP:i:10285 HI:i:18
HWI-ST201:229:C07HGACXX:2:1203:20609:127413:1:N:0:ATCACG 83 1 10129 3 51M1I6M1I6M1I25M = 10335 298 CCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCCTAACCCTAACCCTAACCCTAACCCT ?ABA<3?<?<<3DCB@B<DDCAA9,?CBA=5DDBB>;EDEB;7HHHED=JIGFA<GIIIHF?IJIIGHFJIHCFCJIGFHFIHFFFDJHFD AS:i:-24 XN:i:0 XM:i:0 XO:i:3 XG:i:3 NM:i:3 MD:Z:88 YT:Z:UU NH:i:2 CC:Z:= CP:i:10129 HI:i:0
I have also sent a report to the TopHat email address, but wanted to share that you're not alone!
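One detail that may be worth checking in the two records above: FLAG 321 is 256 + 64 + 1, so it has the 0x100 (secondary alignment) bit set, while FLAG 83 does not. The SAM format permits "*" in the QUAL field when qualities are not stored. Whether that accounts for every "*" in a given output is an assumption, not something this thread confirms. The Python sketch below tallies records by secondary-flag status and presence of qualities; the function name and the toy records are made up for illustration.

```python
from collections import Counter

SECONDARY = 0x100  # SAM FLAG bit meaning "secondary alignment"


def tally_missing_quals(sam_lines):
    """Count alignment records by (is_secondary, has_qual)."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: FLAG
        qual = fields[10]       # column 11: QUAL, "*" when absent
        counts[(bool(flag & SECONDARY), qual != "*")] += 1
    return counts


# Toy records, invented for illustration; only FLAG and QUAL matter here.
toy = [
    "@HD\tVN:1.0",
    "r1\t321\t1\t100\t0\t4M\tX\t500\t0\tACGT\t*",
    "r2\t83\t1\t200\t3\t4M\t=\t300\t104\tACGT\tIIII",
]
counts = tally_missing_quals(toy)
for (is_secondary, has_qual), n in sorted(counts.items()):
    print(f"secondary={is_secondary} has_qual={has_qual}: {n}")
```

To try it on real data, replace `toy` with `sys.stdin` and pipe in the text output of `samtools view` for the BAM file in question. If the records without qualities are not all secondary, the explanation lies elsewhere.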

#5
Senior Member
Location: MDC, Berlin, Germany
Join Date: Oct 2009
Posts: 317
I did another run, mapping directly to the reference genome rather than to a transcriptome. TopHat2 ended with a similar error:
Code:
fail to read the header from "T34_tophat2_genome/tmp/left_kept_reads_unmapped.fq".
Code:
@39387
T13323133231032130303001010113000104313423130441340
+3_19_590_F3
AAA=A2.%(='81-5%&;%%51(.1)&',')-'5!3**!*,'+)!!)=!,
@39398
T31202110130210003323123122331321034123433032442343
+3_19_1526_F3
(A=/5/A@>(.B=9)&BA@/=B>)>3B?)'*@??!)B-!&/8:2!!A<!'
@39402
T30231202003222033021022010303030024203413010441343
+3_20_156_F3
@8(&2,9(-3731%:3*''783&6)8.1'-+)0-!408!(3%%+!!+(!3
@39403
T31130311333002111122221010023203034033432000441040
+3_20_203_F3
A7B>5A:?>BB;@4'3:A=+;6<3?51@>'<,A>!=53!,/.-/!!0=!2
__________________
Xi Wang

#6
Junior Member
Location: Storrs, Connecticut, US
Join Date: Apr 2012
Posts: 5
I checked all BAM/SAM files in the tmp directory with samtools. It turns out that the file tmp.samheader.sam (and other SAM files) cannot be opened with samtools; it gives exactly the error messages ([bam_header_read] ... bad EOF, etc.) that we see on screen.
I ran bowtie with the exact commands issued by TopHat (from the run.log file). Bowtie runs fine (with both SAM and plain-text output), and the output is valid. But when this output is piped to fix_map_order (an internal utility of TopHat), TopHat tries to read this temp.samheader.sam file and breaks. Note: this file is created very early when you run TopHat.
Out of frustration, I am not using TopHat for now. I have created my own splice junction library (using the RSEQtools library) and intend to use bowtie (or bfast or bwa) to align my reads against both the reference genome and this splice junction library.
Last edited by saad0105050; 04-30-2012 at 12:13 PM. Reason: Typo in the tool name `RSEQtools'

#7
Registered Vendor
Location: MD
Join Date: Feb 2012
Posts: 18
+1 to all of you:
I ran this command:
Code:
$TOPHAT -o $DEST -C -Q --bowtie1 -p 60 -r 200 --mate-std-dev 30 --report-secondary-alignments --report-discordant-pair-alignments --coverage-search --microexon-search --library-type fr-secondstrand --keep-tmp -z0 $BOWTiEIndex/human_g1k_v37_decoy "$SAMPLE"_F3.csfasta "$SAMPLE"_F5.csfasta "$SAMPLE"_F3_QV.qual "$SAMPLE"_F5_QV.qual
Code:
[2012-04-30 11:48:52] Beginning TopHat run (v2.0.0)
-----------------------------------------------
[2012-04-30 11:48:52] Checking for Bowtie
    Bowtie version: 0.12.7.0
[2012-04-30 11:48:52] Checking for Samtools
    Samtools version: 0.1.18.0
[2012-04-30 11:48:52] Checking for Bowtie index files
[2012-04-30 11:48:52] Checking for reference FASTA file
[2012-04-30 11:48:52] Generating SAM header for /home/biouml/galaxy/galaxy-tools-data/genomes/Hsapiens/hg19/bowtie_color//human_g1k_v37_decoy
    format: fasta
[2012-04-30 11:49:32] Preparing reads
    left reads: min. length=50, count=28234582
    right reads: min. length=35, count=28088955
[2012-04-30 12:19:10] Mapping left_kept_reads against human_g1k_v37_decoy with Bowtie
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from "tophat_out2/tmp/left_kept_reads_unmapped.fq".
[2012-04-30 12:32:57] Mapping right_kept_reads against human_g1k_v37_decoy with Bowtie
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from "tophat_out2/tmp/right_kept_reads_unmapped.fq".
Warning: junction database is empty!
[2012-04-30 12:45:26] Processing bowtie hits
[2012-04-30 13:06:25] Processing bowtie hits
[2012-04-30 13:23:50] Reporting output tracks
-----------------------------------------------
[2012-04-30 13:48:50] Run complete: 01:59:58 elapsed
P.S. I sent all the logs to the developers; I hope they will answer.

#8
Senior Member
Location: MDC, Berlin, Germany
Join Date: Oct 2009
Posts: 317
Quote:
My runs ended up with "left_kept_reads.m2g_um.fq", which was a FASTQ file, and I cannot understand at all why samtools tried to open a FASTQ file! It's ridiculous!
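A quick way to confirm what a temporary file really contains, independent of its extension, is to look at its first bytes. The sketch below is a heuristic only, written in Python: BAM is a gzip-style (BGZF) container whose decompressed data begins with the magic bytes "BAM\1", whereas FASTQ and SAM are plain text starting with "@". The function name `sniff_format` and the sample record are made up for illustration; this is not something TopHat itself provides.

```python
import gzip
import os
import tempfile
import zlib


def sniff_format(path):
    """Guess a file's format from its first bytes (heuristic only)."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    if head[:2] == b"\x1f\x8b":
        # gzip/BGZF container; decompressed BAM data starts with "BAM\1"
        try:
            with gzip.open(path, "rb") as handle:
                magic = handle.read(4)
        except (OSError, EOFError, zlib.error):
            return "gzip"  # compressed, but unreadable or truncated
        return "bam" if magic == b"BAM\x01" else "gzip"
    if head[:1] == b"@":
        # SAM header lines look like "@HD<tab>" or "@SQ<tab>";
        # anything else starting with "@" is treated as FASTQ
        tag = head[1:3]
        if tag.isalpha() and tag.isupper() and head[3:4] == b"\t":
            return "sam"
        return "fastq"
    if head[:1] == b">":
        return "fasta"
    return "unknown"


# Demo on a throwaway file holding one made-up FASTQ record.
with tempfile.NamedTemporaryFile("wb", suffix=".fq", delete=False) as tmp:
    tmp.write(b"@read1\nT0123\n+\n!!!!\n")
print(sniff_format(tmp.name))
os.remove(tmp.name)
```

Running it over each file in the tmp directory shows which ones samtools could be expected to open at all.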
__________________
Xi Wang

#9
Senior Member
Location: MDC, Berlin, Germany
Join Date: Oct 2009
Posts: 317
Quote:
__________________
Xi Wang

#10
Senior Member
Location: MDC, Berlin, Germany
Join Date: Oct 2009
Posts: 317
Quote:
Code:
[2012-05-01 11:58:20] Beginning TopHat run (v2.0.0)
-----------------------------------------------
[2012-05-01 11:58:20] Checking for Bowtie
    Bowtie version: 0.12.7.0
[2012-05-01 11:58:20] Checking for Samtools
    Samtools version: 0.1.17.0
[2012-05-01 11:58:20] Checking for Bowtie index files
[2012-05-01 11:58:20] Checking for Bowtie index files
[2012-05-01 11:58:20] Checking for reference FASTA file
[2012-05-01 11:58:20] Generating SAM header for /home/xwang/data/hg19/bowtie_index/hg19.color
    format: fasta
[2012-05-01 11:59:25] Reading known junctions from GTF file
[2012-05-01 12:00:03] Preparing reads
    left reads: min. length=50, count=64422218
[2012-05-01 13:02:39] Using pre-built transcriptome index..
[2012-05-01 13:03:03] Mapping left_kept_reads against transcriptome hg19_Ensemble.GRCh37_65 with Bowtie
[2012-05-01 13:30:59] Converting left_kept_reads.m2g to genomic coordinates (map2gtf)
[2012-05-01 13:34:20] Resuming TopHat pipeline with unmapped reads
[2012-05-01 13:34:20] Mapping left_kept_reads.m2g_um against hg19.color with Bowtie
[2012-05-01 16:52:01] Mapping left_kept_reads.m2g_um_seg1 against hg19.color with Bowtie (1/2)
[2012-05-01 20:09:31] Mapping left_kept_reads.m2g_um_seg2 against hg19.color with Bowtie (2/2)
[2012-05-01 23:25:42] Searching for junctions via segment mapping
__________________
Xi Wang

#11
Member
Location: Rockville
Join Date: May 2009
Posts: 40
Yes, I have the same problem. It has been running for two days (48 hours), and no files in the tmp folder have been updated for the last 10 hours, so it seems to have stopped.
Did you fix this problem? Thanks
Quote:

#12
Senior Member
Location: MDC, Berlin, Germany
Join Date: Oct 2009
Posts: 317
Yes, "segment_juncs" can be very slow on a large data set. You can look in the logs folder, where up-to-date progress is recorded. I haven't looked into this issue, but the developers should probably address it: fix the bug (if it is one) or provide a new facility.
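To tell a slow run from a stalled one, it can help to check when anything in the output directory was last written. The Python sketch below walks a directory and reports the most recently modified file and its age in seconds. The function name `newest_file` is made up for illustration, and a long gap between writes does not by itself prove the run is stuck.

```python
import os
import tempfile
import time


def newest_file(directory):
    """Return (path, age_in_seconds) of the most recently modified file,
    or None if the directory tree contains no files."""
    newest = None
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            mtime = os.path.getmtime(path)
            if newest is None or mtime > newest[1]:
                newest = (path, mtime)
    if newest is None:
        return None
    return newest[0], time.time() - newest[1]


# Demo on a throwaway directory with a single freshly written file.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "example.log"), "w") as handle:
        handle.write("demo\n")
    path, age = newest_file(demo_dir)
    print(os.path.basename(path), f"last modified {age:.0f} s ago")
```

Pointing it at the logs and tmp folders inside the TopHat output directory (for example the directory given with `-o`) every so often shows whether anything is still being updated.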
__________________
Xi Wang

#13
Member
Location: Spain
Join Date: Mar 2009
Posts: 12
Hey guys,
I'm having the same problem here. I think it has to do with the colorspace-formatted reads, since I can run TopHat on normal Illumina FASTQ files without errors, but not on this kind of colorspace read. For some reason bowtie1/TopHat seems to be trying to read a FASTQ file as if it were a BAM file, and everything fails from there.
My temporary workaround will be to manually convert the colorspace reads to normal .fastq reads and map them with bowtie2 against a normal index, since that should work. Here's hoping the TopHat guys will fix this at some point.
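For anyone curious what converting colorspace reads to base space involves, here is a naive Python sketch. A SOLiD read is a known primer base followed by colors 0-3, and with A=0, C=1, G=2, T=3 each base is the previous base XOR the color. The function name `decode_colorspace` is made up for illustration, and this is not what TopHat or bowtie do internally. Note the well-known caveat: one wrong color call changes every base after it, which is why aligners normally map in color space.

```python
BASES = "ACGT"  # index doubles as the 2-bit code: A=0, C=1, G=2, T=3


def decode_colorspace(read):
    """Translate a color-space read (primer base + colors) to bases."""
    prev = BASES.index(read[0])  # leading primer base, not part of the read
    out = []
    for color in read[1:]:
        if color not in "0123":
            # '.' or another code marks a missing call; every base from
            # here on is undetermined, so pad with N and stop.
            out.append("N" * (len(read) - 1 - len(out)))
            break
        prev ^= int(color)  # next base = previous base XOR color
        out.append(BASES[prev])
    return "".join(out)


print(decode_colorspace("T0123"))  # TGAT
```

The quality values would need separate handling, since each color quality relates to a pair of adjacent bases rather than to a single base.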

#14
Member
Location: cambridge, MA
Join Date: Dec 2012
Posts: 11
In case anyone is still struggling with this issue, I was able to get rid of this error by using a newer version of TopHat (2.0.6). This is the call that I used (for single-end 50 bp reads):
Code:
tophat --library-type fr-secondstrand --segment-length 25 --no-coverage-search --no-novel-juncs -G gencode.v14.annotation.gtf -o my_output_dir --color --bowtie1 --quals --transcriptome-index my_transcriptome_index hg19 "/unprotected/projects/lasvchal/moss/raw_data/my.csfasta" "/unprotected/projects/lasvchal/moss/raw_data/my_QV.qual"

Tags: rna-seq, solid, tophat2