SEQanswers

02-09-2012, 03:24 AM   #1
papori
Senior Member

Location: berd

Join Date: Dec 2010
Posts: 179
Tophat - Error: gtf_to_fasta returned an error.

Hi all,
I am trying to use TopHat with an annotation file.
I am working with the Zv9 annotations from UCSC.
I edited the original GTF file so that its first column matches the reference sequence names in the Bowtie index.
For example:
GTF - (first 2 lines)
#chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds score name2 cdsStartStat cdsEndStat exonFrames
chr1 + 50321633 50410568 50322024 50393582 11 50321633,50323684,50327722,50376641,50384688,50384995,50387281,50388021,50392530,50393547,50409289, 50322231,50323751,50327850,50376774,50384782,50385109,50387444,50388129,50392579,50393588,50410568, 0 lef1 cmplcmpl 0,0,1,0,1,2,2,0,0,1,-1,

REFERENCE - (first 2 lines)
>chr1
TTCTTCTGGGGAAAGTCTGATTTGATTTATTTCCCTTTTAAGATCAATATTATTAGCCCC
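
The name matching described above can be checked mechanically. The following is a minimal sketch, assuming the annotation is tab-delimited with the sequence name in the first column and that FASTA headers carry the name right after `>`. The helper names and the `chrUn_example` entry are made up for illustration and are not part of TopHat or Bowtie.

```python
from typing import Iterable, List, Set


def fasta_names(lines: Iterable[str]) -> Set[str]:
    """Collect sequence names from FASTA headers (text after '>' up to the first whitespace)."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def annotation_names(lines: Iterable[str]) -> Set[str]:
    """Collect first-column values from a tab-delimited annotation, skipping comments and blank lines."""
    names = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0].strip())
    return names


def names_missing_from_reference(annotation_lines: Iterable[str],
                                 fasta_lines: Iterable[str]) -> List[str]:
    """Return annotation sequence names that have no matching FASTA header, sorted."""
    return sorted(annotation_names(annotation_lines) - fasta_names(fasta_lines))


# Tiny in-memory example; with real files, pass open file handles instead of lists.
reference = [">chr1", "TTCTTCTGGGGAAAGTCTGATTTG"]
annotation = ["#chrom\tstrand\ttxStart", "chr1\t+\t50321633", "chrUn_example\t-\t100"]
print(names_missing_from_reference(annotation, reference))
```

An empty list means every name in the annotation's first column also appears in the reference. It says nothing about whether the remaining columns are valid GTF.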

When I run TopHat without the GTF, everything runs fine.
With the GTF, I get this error:
Error: gtf_to_fasta returned an error.

My command:
nohup ./tophat -r 430 -p 10 -z 0 -G ../annotation /mnt/FILE/index/zvgenome ../ex1/R1_001.fastq ../ex1/R2_001.fastq &

Is anyone familiar with this?

Best,
Pap
02-09-2012, 03:29 AM   #2
papori
Senior Member

Location: berd

Join Date: Dec 2010
Posts: 179

This is the full output:

[Thu Feb 9 06:59:47 2012] Beginning TopHat run (v1.4.0)
-----------------------------------------------
[Thu Feb 9 06:59:47 2012] Preparing output location ./tophat_out/
[Thu Feb 9 06:59:47 2012] Checking for Bowtie index files
[Thu Feb 9 06:59:47 2012] Checking for reference FASTA file
[Thu Feb 9 06:59:47 2012] Checking for Bowtie
Bowtie version: 0.12.7.0
[Thu Feb 9 06:59:47 2012] Checking for Samtools
Samtools Version: 0.1.18
[Thu Feb 9 06:59:47 2012] Generating SAM header for /mnt/FILE/index/zvgenome
format: fastq
quality scale: phred33 (default)
[Thu Feb 9 06:59:51 2012] Reading known junctions from GTF file
Warning: TopHat did not find any junctions in GTF file
[Thu Feb 9 06:59:51 2012] Preparing reads
left reads: min. length=101, count=21379580
right reads: min. length=101, count=21310206
[Thu Feb 9 07:08:54 2012] Creating transcriptome data files..
[FAILED]
Error: gtf_to_fasta returned an error.
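
The `Warning: TopHat did not find any junctions in GTF file` line in this log is what one would expect if the annotation holds no exon records in a form TopHat can parse. Here is a minimal sketch for counting such records, assuming that usable records are 9-column tab-delimited lines whose third field is `exon`. `count_exon_records` is a hypothetical helper, not a TopHat tool, and the sample lines are illustrative.

```python
from typing import Iterable, Tuple


def count_exon_records(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (exon_records, other_lines), ignoring comments and blank lines.

    A line counts as an exon record only if it has exactly 9 tab-separated
    fields and the third field is 'exon'.
    """
    exons = 0
    other = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) == 9 and fields[2] == "exon":
            exons += 1
        else:
            other += 1
    return exons, other


# One well-formed GTF exon line and one row in a different tabular layout.
sample = [
    'chr1\tucsc\texon\t50321634\t50322231\t.\t+\t.\tgene_id "lef1"; transcript_id "lef1_t1";',
    "chr1\t+\t50321633\t50410568\t50322024\t50393582\t11",
]
print(count_exon_records(sample))
```

A result with zero exon records for the file passed to `-G` would point at the annotation format rather than at the reads or the index.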
04-04-2012, 01:03 PM   #3
lakshmaa
Member

Location: Boston

Join Date: Jun 2010
Posts: 11

I get the same error, and I am also working with Zv9 (the zebrafish genome). Can anyone please help me with this?
05-18-2012, 09:21 AM   #4
yingzhang
Junior Member

Location: Minneapolis

Join Date: Feb 2012
Posts: 9

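I don't think your GTF file is in the right format. The lines you posted look like a UCSC table dump (one row per transcript, with comma-separated exon starts and ends), not GTF.

According to UCSC, a GTF file contains 9 tab-separated columns: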
<seqname> <source> <feature> <start> <end> <score> <strand> <frame> [attributes]
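
To make the difference concrete, here is a rough sketch of turning one row of the table layout posted above into 9-column GTF exon lines. It assumes the row is tab-delimited with the 14 columns named in the posted header, and that the table uses 0-based, half-open coordinates while GTF uses 1-based, inclusive ones (hence the `+ 1` on the start). The function name, the `ucsc_table` source label, and the `lef1_t1` transcript ID are made up for illustration, since the posted table has no transcript name column.

```python
from typing import List


def table_row_to_gtf_exons(row: str, transcript_id: str, source: str = "ucsc_table") -> List[str]:
    """Convert one table row (chrom, strand, ..., exonStarts, exonEnds, score, name2, ...)
    into GTF 'exon' lines, one per exon."""
    fields = row.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("expected at least 11 tab-separated columns")
    chrom, strand = fields[0], fields[1]
    # exonStarts / exonEnds are comma-separated lists with a trailing comma.
    starts = [int(s) for s in fields[7].split(",") if s]
    ends = [int(e) for e in fields[8].split(",") if e]
    if len(starts) != len(ends):
        raise ValueError("exonStarts and exonEnds differ in length")
    gene = fields[10]  # the name2 column
    attributes = f'gene_id "{gene}"; transcript_id "{transcript_id}";'
    lines = []
    for start, end in zip(starts, ends):
        # Shift the 0-based start to 1-based; the half-open end already equals the inclusive end.
        lines.append("\t".join(
            [chrom, source, "exon", str(start + 1), str(end), ".", strand, ".", attributes]
        ))
    return lines


# Shortened version of the posted row, keeping only its first two exons.
row = "\t".join([
    "chr1", "+", "50321633", "50410568", "50322024", "50393582", "2",
    "50321633,50323684,", "50322231,50323751,", "0", "lef1", "cmpl", "cmpl", "0,0,",
])
for gtf_line in table_row_to_gtf_exons(row, transcript_id="lef1_t1"):
    print(gtf_line)
```

This only emits exon records, which is a simplification. A complete conversion would also need real transcript identifiers and, if wanted, CDS records, so exporting GTF directly from the source is likely the safer route.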



05-30-2012, 09:04 AM   #5
paula123
Junior Member

Location: new delhi

Join Date: May 2012
Posts: 3
Genome file of Entamoeba in GTF format

Hi,
I am working with Entamoeba histolytica data and need the Entamoeba histolytica reference genome data in GTF format. I have the file in GenBank format but cannot find it in GTF format. If anyone can point me to an appropriate link, I would be very grateful.
11-15-2012, 02:52 AM   #6
havard
Junior Member

Location: Oslo, Norway

Join Date: Apr 2010
Posts: 2

Hi,

I had the same problem, but I think I have solved it now. I believe the error occurs because the FASTA file name differs from the base name of the index files and/or the GTF file. So if your index and GTF base name is Danio_rerio, then your FASTA file should be Danio_rerio.fa.
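
A quick way to test this naming idea before launching a long run is to check for a `.fa` file that shares the index base name. The sketch below assumes the FASTA is expected at `<index_base>.fa` next to the index files. Both helper names are hypothetical, and the demonstration uses a temporary directory rather than a real index.

```python
import tempfile
from pathlib import Path


def expected_fasta_path(index_base: str) -> Path:
    """Return the FASTA path that shares the index base name, e.g. Danio_rerio -> Danio_rerio.fa."""
    return Path(index_base + ".fa")


def reference_fasta_present(index_base: str) -> bool:
    """True if <index_base>.fa exists as a regular file."""
    return expected_fasta_path(index_base).is_file()


# Demonstration in a throwaway directory; no real Bowtie index is needed for the check itself.
with tempfile.TemporaryDirectory() as tmp:
    base = str(Path(tmp) / "Danio_rerio")
    print(reference_fasta_present(base))  # checked before Danio_rerio.fa is created
    Path(base + ".fa").write_text(">chr1\nTTCTTCTGGGG\n")
    print(reference_fasta_present(base))  # checked after Danio_rerio.fa is created
```

For the command in the first post, the base name to pass would be `/mnt/FILE/index/zvgenome`, so the file looked for is `/mnt/FILE/index/zvgenome.fa`.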
09-03-2013, 05:30 PM   #7
xfh
Member

Location: China

Join Date: Jan 2011
Posts: 26

Hi, I also have this problem. In my case the chromosome names are the same in the index and the GTF, and the index, .fa, and GTF files all share the base name hg18_ref. Can anyone help me?
09-03-2014, 02:15 PM   #8
tulipnandu
Junior Member

Location: Dallas

Join Date: Feb 2013
Posts: 4
Tophat problem gtf to fasta

Many people have faced the same problem, and I just got past it myself. Follow these steps and see if they work for you too.
  1. Go to the following link and select the genome you want to download. In my case I downloaded the UCSC mm10 mouse genome. (http://cufflinks.cbcb.umd.edu/igenomes.html)
  2. Unpack the archive. You will see mm10/Annotation and mm10/Sequence. These folders contain all the files required for the TopHat run. Just make sure the paths in the tophat command point to them.
  3. Here is the command I used:
    tophat -p 8 --keep-fasta-order --no-coverage-search --library-type fr-firststrand -G Mus_musculus/UCSC/mm10/Annotation/Archives/archive-2014-05-23-16-05-10/Genes/genes.gtf --transcriptome-index Mus_musculus/UCSC/mm10/Annotation/Genes/transcriptome_index_bt2/genes -g 10 --output-dir shP1_4hr_n1 Mus_musculus/UCSC/mm10/Sequence/Bowtie2Index/genome *.fastq.gz

In the above case the archive has the UCSC genes.gtf file, which already has chr-prefixed chromosome names and the gene names. Make sure you don't rename those files. The --transcriptome-index output prefix also has to be something like Mus_musculus/UCSC/mm10/Annotation/Genes/transcriptome_index_bt2/genes; I don't know why, but that worked. The index files are in the Sequence/Bowtie2Index folder (you can also use the Bowtie 1 index files). The last argument is the input reads.
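
Since the advice here comes down to getting each path into the right slot, one way to keep that straight is to assemble the command from named pieces. This is only a sketch: `build_tophat_args` is a hypothetical helper that reuses the flags from the command above and nothing else, and the read file name is a placeholder. It builds the argument list and prints it without running TopHat.

```python
import shlex
from typing import List, Sequence


def build_tophat_args(gtf: str, transcriptome_index: str, bowtie_index: str,
                      reads: Sequence[str], output_dir: str, threads: int = 8) -> List[str]:
    """Assemble the tophat argument list with the same flags as the command above."""
    return [
        "tophat",
        "-p", str(threads),
        "--keep-fasta-order",
        "--no-coverage-search",
        "--library-type", "fr-firststrand",
        "-G", gtf,                                    # annotation file, not a directory
        "--transcriptome-index", transcriptome_index, # output prefix for the transcriptome index
        "-g", "10",
        "--output-dir", output_dir,
        bowtie_index,                                 # index base name, without file extensions
        *reads,                                       # read files, passed through as given
    ]


args = build_tophat_args(
    gtf="Mus_musculus/UCSC/mm10/Annotation/Archives/archive-2014-05-23-16-05-10/Genes/genes.gtf",
    transcriptome_index="Mus_musculus/UCSC/mm10/Annotation/Genes/transcriptome_index_bt2/genes",
    bowtie_index="Mus_musculus/UCSC/mm10/Sequence/Bowtie2Index/genome",
    reads=["sample.fastq.gz"],  # placeholder file name
    output_dir="shP1_4hr_n1",
)
# Print a shell-quoted command line for inspection instead of executing it.
print(" ".join(shlex.quote(a) for a in args))
```

Printing the command first makes it easy to spot a `-G` value that points at a directory or a path that does not exist.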

Hope this helps. If it doesn't, let me know and I can help you further.

Tulip.