**Nicolas** · 01-19-2012, 08:00 AM

I think that is new in TopHat 1.4.
You don't need to provide the file "Homo_sapiens.GRCh37.65.fa" yourself; it is generated from the genome index (in your case: /home/RefGenome/hg19/Homo_sapiens_assembly19_sorted) and from the GTF file (which does not need to have the same name as the Bowtie index).

Did you check if the chromosomes are labelled the same in both files?

**Jon_Keats** · 01-19-2012, 11:08 AM

Your GTF looks to be from Ensembl. I'm not sure about your genome; my guess is UCSC. This often causes a problem, as UCSC uses chr1 while Ensembl uses just 1. Best not to mix and match data sources when you can avoid it.
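
A quick way to see whether the two files agree is to compare the names directly. Below is a minimal Python sketch, not part of TopHat: the helper names `fasta_names` and `gtf_seqnames` and the tiny inline inputs are made up for illustration, and taking only the first whitespace-delimited token of each FASTA header is an assumption you may need to change for your index.

```python
import io

def fasta_names(handle):
    """Return the set of sequence names in a FASTA stream.

    Assumption: the name is the first whitespace-delimited token after '>'.
    """
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names

def gtf_seqnames(handle):
    """Return the set of values in column 1 of a GTF/GFF stream, skipping comments."""
    names = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names

# Made-up example data: UCSC-style FASTA names vs. Ensembl-style GTF names.
fasta = io.StringIO(">chr1 some description\nACGT\n>chr2\nGGCC\n")
gtf = io.StringIO(
    '1\tensembl\texon\t1\t4\t.\t+\t.\tgene_id "g1";\n'
    '2\tensembl\texon\t1\t4\t.\t+\t.\tgene_id "g2";\n'
)

in_fasta = fasta_names(fasta)
in_gtf = gtf_seqnames(gtf)
only_in_gtf = sorted(in_gtf - in_fasta)
only_in_fasta = sorted(in_fasta - in_gtf)
print("only in GTF:  ", only_in_gtf)
print("only in FASTA:", only_in_fasta)
```

If both lists are empty the names match; anything listed under "only in GTF" is an annotation sequence the genome does not know about. For real files, pass `open(path)` handles instead of the `io.StringIO` objects.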

**kenietz** · 01-30-2012, 01:45 AM

Hi, I get the same error. I built the Bowtie index manually from a FASTA file and I have a GFF3 file as well. I used bowtie-inspect --names to get the names from the index and renamed all entries in the first column of the GFF3 file to match. I still get an error while the index is being built from the FASTA file. But why should it be built again when it is already built?
I have the names in both files in this format:
gi|240254421|ref|NC_003070.9| Arabidopsis thaliana chromosome 1, complete sequence

**kenietz** · 01-30-2012, 05:42 PM

Hi again,
I think there is an error in the tophat script. It expects the FASTA file for indexing to be in the output directory, which doesn't make sense. Here is the line from run.log:

/opt/bowtie-0.12.7/bowtie-build ./tophat_out/tmp/A_thaliana_rg.fa ./tophat_out/tmp/A_thaliana_rg

**ruiN** · 02-25-2012, 02:51 PM

I am stuck here too. But I don't think it's an error in the script. Apparently, if you supply a GFF, TopHat calls Bowtie to re-index using a new FASTA file that it made from that GFF. The new file is placed in the temp folder. For some reason bowtie-build can't open the FASTA file it just created for me. I guess I'm just going to try running TopHat without the GFF and see if that works.

**julio514** · 04-17-2012, 09:41 AM

As Jon_Keats suggested, these identifiers have to be the same. I had the same issue, and it was apparently caused by the chromosome identifiers being "1" in my GTF file but "chr1" in my FASTA file. I made the corrections and it works fine now.
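
For anyone wanting to script that kind of correction, here is a rough Python sketch that adds a "chr" prefix to the first column of a GTF. The function name `add_chr_prefix` and the sample lines are hypothetical. It assumes a plain prefix is enough, which is not always true: for example, Ensembl names the mitochondrial sequence MT while UCSC uses chrM, so check the result against your genome's names.

```python
def add_chr_prefix(gtf_line, prefix="chr"):
    """Prefix column 1 of a GTF line; leave comments, blank lines and
    already-prefixed lines untouched."""
    if gtf_line.startswith("#") or not gtf_line.strip():
        return gtf_line
    seqname, sep, rest = gtf_line.partition("\t")
    if seqname.startswith(prefix):
        return gtf_line
    return prefix + seqname + sep + rest

# Made-up example lines in Ensembl style.
lines = [
    "#!genome-build GRCh37\n",
    '1\tensembl\texon\t100\t200\t.\t+\t.\tgene_id "g1";\n',
    'X\tensembl\texon\t300\t400\t.\t-\t.\tgene_id "g2";\n',
]
fixed = [add_chr_prefix(line) for line in lines]
print("".join(fixed), end="")
```

Writing the result to a new file rather than overwriting the original GTF keeps the untouched annotation around in case the renaming needs adjusting.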

**aforntacc** · 07-24-2013, 03:13 AM

Hello people, I am very new to TopHat, Bowtie and SAMtools.
I read the TopHat manual and ran it on Ubuntu virtualized on Windows 7, and I got this error:
[2013-07-20 04:11:39] Beginning TopHat run (v2.0.9)
-----------------------------------------------
[2013-07-20 04:11:39] Checking for Bowtie
Bowtie version: 2.1.0.0
[2013-07-20 04:11:39] Checking for Samtools
Samtools version: 0.1.19.0
[2013-07-20 04:11:39] Checking for Bowtie index files (genome)..
[2013-07-20 04:11:39] Checking for reference FASTA file
Warning: Could not find FASTA file seq.fa
[2013-07-20 04:11:39] Reconstituting reference FASTA file from Bowtie index
Executing: /usr/bin/bowtie2-inspect seq > ./tophat_out/tmp/seq.fa
[2013-07-20 04:11:45] Generating SAM header for seq
format: fastq
quality scale: phred33 (default)
[2013-07-20 04:11:51] Preparing reads
left reads: min. length=100, max. length=100, 63588062 kept reads (424 discarded)
right reads: min. length=100, max. length=100, 63000645 kept reads (587841 discarded)
[2013-07-20 05:17:15] Mapping left_kept_reads to genome seq with Bowtie2
[2013-07-20 13:05:46] Mapping left_kept_reads_seg1 to genome seq with Bowtie2 (1/4)
[2013-07-20 13:59:24] Mapping left_kept_reads_seg2 to genome seq with Bowtie2 (2/4)
[2013-07-20 15:06:18] Mapping left_kept_reads_seg3 to genome seq with Bowtie2 (3/4)
[2013-07-20 15:56:59] Mapping left_kept_reads_seg4 to genome seq with Bowtie2 (4/4)
[2013-07-20 16:49:52] Mapping right_kept_reads to genome seq with Bowtie2
[2013-07-20 23:01:03] Mapping right_kept_reads_seg1 to genome seq with Bowtie2 (1/4)
[FAILED]
Error running bowtie:
Saw ASCII character -93 but expected 33-based Phred qual.
terminate called after throwing an instance of 'int'

Please, what should I do?
Thanks

**mastal** · 07-24-2013, 03:35 AM

Tophat 1.4.0 RNA seq mapping

What type of reads are you trying to align,
and what quality scale is used for the base qualities?

**aforntacc** · 07-24-2013, 03:47 AM

Sorry for leaving out this information.

The reads are paired-end reads from Illumina.
The quality scale is Phred score.

Thanks a lot

**GenoMax** · 07-24-2013, 04:13 AM

Do you know what exact scale/encoding those Q-scores are using? http://en.wikipedia.org/wiki/FASTQ_format (5 encoding types)

That is important as Maria pointed out.
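
If the encoding is not documented, the range of bytes on the quality lines gives a hint. The Python sketch below is only a heuristic under stated assumptions: it expects plain 4-line FASTQ records, the function names are made up, and the cut-offs (33 for Phred+33, 59 for Solexa+64, 64 for Phred+64) are the usual lower bounds of each scale. A file containing only high scores can look like Phred+64 while really being Phred+33.

```python
import io

def quality_byte_range(handle):
    """Return (min, max) byte values over all quality lines of a binary
    FASTQ stream. Assumes exactly 4 lines per record."""
    lo, hi = 255, 0
    for i, line in enumerate(handle):
        if i % 4 == 3:
            qual = line.rstrip(b"\r\n")
            if qual:
                lo = min(lo, min(qual))
                hi = max(hi, max(qual))
    return lo, hi

def guess_encoding(lo, hi):
    """Heuristic guess of the quality encoding from the observed byte range."""
    if lo < 33 or hi > 126:
        return "invalid bytes present"
    if lo < 59:
        return "phred33"
    if lo < 64:
        return "solexa64"
    return "phred64 (or phred33 with only high scores)"

# Made-up single record; '#' is byte 35 and 'I' is byte 73.
fastq = io.BytesIO(b"@r1\nACGT\n+\nII5#\n")
lo, hi = quality_byte_range(fastq)
print(lo, hi, guess_encoding(lo, hi))
```

For a real file, open it with `open(path, "rb")` so that non-ASCII bytes are seen as they are rather than decoded.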

**aforntacc** · 07-24-2013, 04:22 AM

OK, from the report of the sequencing company (I don't know these people) the quality is greater than Q30,
and in the temp log file I see this: quality scale: phred33 (default)
Thanks

**GenoMax** · 07-24-2013, 04:28 AM

Is that a -93 you are seeing in the error (or just 93)? It is hard to tell from the original post. If the scale is Sanger (Phred33), then your raw sequence Q-scores should not have that value.
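
One way to find out is to scan the FASTQ for quality bytes outside the printable range. The Python sketch below is a rough check, not a replacement for a proper validator: the function names are hypothetical, it assumes 4-line records, and it assumes the error message reports the byte as a signed 8-bit value, in which case -93 would correspond to raw byte 163, which is not an ASCII character at all.

```python
import io

def signed8(byte):
    """Interpret an unsigned byte (0-255) as a signed 8-bit value."""
    return byte - 256 if byte > 127 else byte

def bad_quality_records(handle):
    """Yield (record_number, read_name, offending_bytes) for every quality
    line containing bytes outside 33..126. Expects a binary stream with
    4 lines per record."""
    name = b""
    for i, line in enumerate(handle):
        if i % 4 == 0:
            name = line.rstrip(b"\r\n")
        elif i % 4 == 3:
            bad = [b for b in line.rstrip(b"\r\n") if not 33 <= b <= 126]
            if bad:
                yield i // 4 + 1, name.decode("ascii", "replace"), bad

# Made-up data: the second record has byte 0xA3 (163) in its quality line.
fastq = io.BytesIO(b"@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nII\xa3I\n")
for number, name, bad in bad_quality_records(fastq):
    print(number, name, [signed8(b) for b in bad])
# prints: 2 @r2 [-93]
```

If a real file reports records like this, the file itself is probably damaged (for example by an incomplete transfer or decompression) rather than using a different quality scale.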

**aforntacc** · 07-24-2013, 06:03 AM

Yes, you're right, it is -93.
So what should I do?
I would be very glad of any ideas, because right now
I think I am stuck.

**GenoMax** · 07-24-2013, 06:48 AM

Looking at the time stamps in your log it appears that the error is thrown after ~18 h of run time. That is a pain ..

Can you try the script posted by Simon Andrews in post #8 to check for errors in your fastq files: http://seqanswers.com/forums/showthread.php?t=7784

Did you compile bowtie on this VM? Are you using 32-bit Ubuntu or 64-bit?
