**papori** · 02-09-2012, 04:29 AM

This is all the output:

[Thu Feb 9 06:59:47 2012] Beginning TopHat run (v1.4.0)
-----------------------------------------------
[Thu Feb 9 06:59:47 2012] Preparing output location ./tophat_out/
[Thu Feb 9 06:59:47 2012] Checking for Bowtie index files
[Thu Feb 9 06:59:47 2012] Checking for reference FASTA file
[Thu Feb 9 06:59:47 2012] Checking for Bowtie
Bowtie version: 0.12.7.0
[Thu Feb 9 06:59:47 2012] Checking for Samtools
Samtools Version: 0.1.18
[Thu Feb 9 06:59:47 2012] Generating SAM header for /mnt/FILE/index/zvgenome
format: fastq
quality scale: phred33 (default)
[Thu Feb 9 06:59:51 2012] Reading known junctions from GTF file
Warning: TopHat did not find any junctions in GTF file
[Thu Feb 9 06:59:51 2012] Preparing reads
left reads: min. length=101, count=21379580
right reads: min. length=101, count=21310206
[Thu Feb 9 07:08:54 2012] Creating transcriptome data files..
[FAILED]
Error: gtf_to_fasta returned an error.

**lakshmaa** · 04-04-2012, 01:03 PM

I get the same error, and I am working with zv9 (the zebrafish genome). Can anyone please help me with this?

**yingzhang** · 05-18-2012, 09:21 AM

I don't think your GTF file is in the right format.

According to UCSC, a GTF file contains 9 tab-separated columns:
<seqname> <source> <feature> <start> <end> <score> <strand> <frame> [attributes]
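
To make the nine-column point concrete, here is a minimal Python sketch that checks a single line against that layout. The function name `check_gtf_line` is made up for illustration, and the `transcript_id` check is my assumption about what TopHat needs to build transcripts, not something taken from its source.

```python
def check_gtf_line(line):
    """Return a list of problems found in one GTF data line (empty if none)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        # Wrong column count: nothing else can be checked reliably
        return [f"expected 9 tab-separated columns, found {len(fields)}"]

    seqname, source, feature, start, end, score, strand, frame, attributes = fields
    problems = []
    if not (start.isdigit() and end.isdigit()):
        problems.append("start and end must be integers")
    elif int(start) > int(end):
        problems.append("start is greater than end")
    if strand not in {"+", "-", "."}:
        problems.append(f"unexpected strand {strand!r}")
    # Assumption: exon records need a transcript_id to be grouped into transcripts
    if "transcript_id" not in attributes:
        problems.append("attributes lack transcript_id")
    return problems
```

Running it over the first few non-comment lines of the annotation should show quickly whether the file is really GTF or some other UCSC table export.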

Originally posted by papori:

Hi all,
I am trying to use TopHat with an annotation file.
I am working with the zv9 annotations from UCSC.
I edited the original GTF file so that its first column matches the reference sequence names in the Bowtie index.
For example:
GTF (first 2 lines):
#chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds score name2 cdsStartStat cdsEndStat exonFrames
chr1 + 50321633 50410568 50322024 50393582 11 50321633,50323684,50327722,50376641,50384688,50384995,50387281,50388021,50392530,50393547,50409289, 50322231,50323751,50327850,50376774,50384782,50385109,50387444,50388129,50392579,50393588,50410568, 0 lef1 cmplcmpl 0,0,1,0,1,2,2,0,0,1,-1,

REFERENCE (first 2 lines):
>chr1
TTCTTCTGGGGAAAGTCTGATTTGATTTATTTCCCTTTTAAGATCAATATTATTAGCCCC

When I run TopHat without the GTF, it all runs well.
With the GTF, I get this error:
Error: gtf_to_fasta returned an error.

My command:
nohup ./tophat -r 430 -p 10 -z 0 -G ../annotation /mnt/FILE/index/zvgenome ../ex1/R1_001.fastq ../ex1/R2_001.fastq &

Is anyone familiar with this?

Best,
Pap
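
The two "GTF" lines quoted above look like a UCSC gene table export (one row per transcript, with comma-separated exon starts and ends) rather than GTF. Here is a rough Python sketch of turning one such row into GTF exon lines. Everything in it is an assumption for illustration: the function name `genepred_row_to_gtf`, the `source` label, and especially the reuse of the `name2` column as both `gene_id` and `transcript_id`, which is only safe when each gene name appears once in the table.

```python
def genepred_row_to_gtf(row, source="ucsc_table"):
    """Convert one whitespace-separated row of the quoted table into GTF exon lines.

    Column order follows the '#chrom strand txStart ...' header in the post.
    """
    cols = row.split()
    chrom, strand = cols[0], cols[1]
    # exonStarts / exonEnds are comma-separated lists with a trailing comma
    exon_starts = [int(x) for x in cols[7].split(",") if x]
    exon_ends = [int(x) for x in cols[8].split(",") if x]
    name = cols[10]  # name2 column, reused for both IDs (assumption)
    attributes = f'gene_id "{name}"; transcript_id "{name}";'

    lines = []
    for start, end in zip(exon_starts, exon_ends):
        # UCSC tables store 0-based, half-open intervals; GTF is 1-based, inclusive
        fields = [chrom, source, "exon", str(start + 1), str(end),
                  ".", strand, ".", attributes]
        lines.append("\t".join(fields))
    return lines
```

If the table has several transcripts per gene, a real transcript identifier column would be needed instead of `name2`, otherwise the isoforms would be merged into one record.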

**paula123** · 05-30-2012, 09:04 AM

Genome file of Entamoeba in GTF format

Hi,
I am working with Entamoeba histolytica data and need the Entamoeba histolytica reference genome data in GTF format. I found the file in GenBank format but have been unable to find it in GTF format. If anyone can point me to an appropriate link, I would be very grateful.

**havard** · 11-15-2012, 03:52 AM

Hi,

I had the same problem, but I think I have solved it now. I believe the error occurs because the FASTA file name differs from the base name of the index files and/or the GTF file. So if your index and GTF base name is Danio_rerio, then your FASTA file should be Danio_rerio.fa.
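
A quick way to check this before launching a long run could look like the Python sketch below. It only tests whether a file named after the index base with a `.fa` extension exists; the function name `find_reference_fasta` is hypothetical, and the `.fa` naming rule is the suggestion from this post rather than something confirmed from TopHat's documentation.

```python
from pathlib import Path


def find_reference_fasta(index_base):
    """Return the path '<index_base>.fa' if that file exists, else None."""
    candidate = Path(str(index_base) + ".fa")
    return candidate if candidate.is_file() else None
```

For the command in the first post, that would mean calling `find_reference_fasta("/mnt/FILE/index/zvgenome")` and renaming or linking the FASTA file if it returns `None`.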

**xfh** · 09-03-2013, 05:30 PM

Hi, I also have this problem. Here the chromosome names are the same in the index and the GTF, and the index, FASTA, and GTF files all share the base name hg18_ref. Can anyone help me?
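
For double-checking that the chromosome names really agree, a small Python sketch like this one compares the first column of the GTF with the FASTA header names. The function names are made up for illustration, and taking only the first word after `>` as the sequence name is an assumption about how the names are matched.

```python
def fasta_names(lines):
    """Collect sequence names (first word after '>') from FASTA lines."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def gtf_seqnames(lines):
    """Collect the first column of every non-comment, non-empty GTF line."""
    names = set()
    for line in lines:
        if line.strip() and not line.startswith("#"):
            names.add(line.rstrip("\n").split("\t", 1)[0])
    return names


def missing_from_reference(gtf_lines, fasta_lines):
    """Return GTF sequence names that have no matching FASTA record."""
    return gtf_seqnames(gtf_lines) - fasta_names(fasta_lines)
```

Both arguments can be open file handles, for example `missing_from_reference(open("hg18_ref.gtf"), open("hg18_ref.fa"))`. An empty set means every name used in the GTF exists in the reference.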

**tulipnandu** · 09-03-2014, 02:15 PM

Tophat problem gtf to fasta

Many have faced the same problem. Actually I just overcame the problem. Follow the steps and see if you can too.

1. Go to the following link and select the genome you want to download. In my case I downloaded the UCSC mm10 mouse genome. (http://cufflinks.cbcb.umd.edu/igenomes.html)
2. Unzip the file. You will see mm10/Annotation and mm10/Sequence. These folders contain all the files required for the TopHat run. Just make sure the paths in the tophat command point to them.
3. Here is the command I used:
tophat -p 8 --keep-fasta-order --no-coverage-search --library-type fr-firststrand -G Mus_musculus/UCSC/mm10/Annotation/Archives/archive-2014-05-23-16-05-10/Genes/genes.gtf --transcriptome-index Mus_musculus/UCSC/mm10/Annotation/Genes/transcriptome_index_bt2/genes -g 10 --output-dir shP1_4hr_n1 Mus_musculus/UCSC/mm10/Sequence/Bowtie2Index/genome *.fastq.gz

In the command above, the archive contains the UCSC genes.gtf file, which already has the chr-prefixed chromosome names and the gene names. Make sure you don't rename those files. The --transcriptome-index prefix has to be something like Mus_musculus/UCSC/mm10/Annotation/Genes/transcriptome_index_bt2/genes; I don't know why, but that worked for me. The index files are in the Sequence/Bowtie2Index folder (you can also use the Bowtie1 index files). The last argument is the input reads.

Hope this helps. If it doesn't, let me know and I can help you further.

Tulip.

Tophat - Error: gtf_to_fasta returned an error.