#1 |
Member
Location: Australia Join Date: Jun 2009
Posts: 34
|
Can anyone here help me with this TopHat error?
When I map single-end solid sequences (fastq format) to hg18 as follows:

    tophat --solexa1.3-quals /usr/local/bowtie/indexes/hg18 s_1_1.fastq

I get this error:

    [Thu Jan 14 09:28:00 2010] Beginning TopHat run (v1.0.10)
    -----------------------------------------------
    [Thu Jan 14 09:28:00 2010] Preparing output location ./tophat_out/
    [Thu Jan 14 09:28:00 2010] Checking for Bowtie index files
    [Thu Jan 14 09:28:00 2010] Checking for reference FASTA file
    [Thu Jan 14 09:28:00 2010] Checking for Bowtie
        Bowtie version: 0.10.1.0
    [Thu Jan 14 09:28:00 2010] Checking reads
        seed length: 76bp
        format: fastq
        quality scale: --solexa1.3-quals
        Splitting reads into 3 segments
    [Thu Jan 14 09:52:04 2010] Mapping reads against hg18 with Bowtie
        [FAILED]
    Error: could not execute Bowtie
    Traceback (most recent call last):
      File "/usr/local/tophat-1.0.10/bin/tophat", line 1490, in ?
        sys.exit(main())
      File "/usr/local/tophat-1.0.10/bin/tophat", line 1462, in main
        user_supplied_juncs)
      File "/usr/local/tophat-1.0.10/bin/tophat", line 1241, in spliced_alignment
        seg)
      File "/usr/local/tophat-1.0.10/bin/tophat", line 752, in bowtie
        exit(1)
    TypeError: 'str' object is not callable

What could be wrong? Can anyone help me? Thanks in advance.
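A note on the last line of that traceback: the `in ?` frame label suggests the script ran under an old Python 2 interpreter, and in Python releases before 2.5 the builtin `exit` was a plain help string rather than something callable. If that is what happened here, the `TypeError` is probably only a side effect of TopHat trying to quit after Bowtie failed, and the real problem is the earlier "could not execute Bowtie". The sketch below imitates that behaviour in a modern Python with a stand-in variable (`legacy_exit` is a hypothetical name used for illustration, not part of TopHat).

```python
# Stand-in for the pre-2.5 builtin `exit`, which was a help string.
# (Assumption for illustration: the real interpreter on the server is unknown.)
legacy_exit = "Use Ctrl-D (i.e. EOF) to exit."


def try_legacy_exit():
    """Call the string as if it were a function and report the error text."""
    try:
        legacy_exit(1)  # same shape as the `exit(1)` call in the traceback
    except TypeError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print(try_legacy_exit())
```

In other words, the line worth debugging is the Bowtie failure, not the `exit(1)` call.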
#2 |
Member
Location: St Louis, MO Join Date: Nov 2009
Posts: 27
|
Do you have bowtie in your PATH?
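One quick way to check this is sketched below, using only the Python standard library. The tool names listed are the usual executable names and are an assumption about this particular install; `find_tool` is a helper name invented for the example.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    # Assumed executable names; adjust to match the local installation.
    for tool in ("bowtie", "bowtie-build", "tophat"):
        path = find_tool(tool)
        print(f"{tool}: {path if path else 'NOT FOUND on PATH'}")
```

Run it in the same shell session that launches TopHat, since PATH can differ between login sessions.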
#3 |
Member
Location: St Louis, MO Join Date: Nov 2009
Posts: 27
|
By the way, there are newer versions of both bowtie and tophat available for download and the authors have squashed a few bugs. Probably not relevant to your error, but worth having the latest.
#4 |
Member
Location: Australia Join Date: Jun 2009
Posts: 34
|
Yes, I have bowtie in my path.
I have run the test data and it works. The s_1_1.fastq is ~3 GB, converted and joined from 120 separate qseq.txt files using the perl script provided in the thread 'Conversion from ‘qseq.txt’ to ‘fastq’ format'. I did a quick test by converting and joining only 10 qseq.txt files and running them through tophat, and it also worked. But when I converted and joined all 120 files, it showed the error above. Any suggestions?
#5 |
Member
Location: St Louis, MO Join Date: Nov 2009
Posts: 27
|
Hmm, I've never tried tophat with such large fastq files. The largest I've tried has been 1.5G. Maybe you should get in touch with Cole Trapnell, the guy who largely wrote Tophat, and see if there's a reason why it's choking on large input files. (Cole was very helpful via e-mail with some annotation problems I had in early versions of Tophat.)
#6 |
Member
Location: Australia Join Date: Jun 2009
Posts: 34
|
Thanks! I will try.
Just one question about the hg18 reference. I noticed that hg18.3.ebwt is only 4 KB, whereas the other ebwt files are 300-800 MB. I downloaded the 2.7 GB UCSC hg18 and unzipped it in Windows.
#8 |
Member
Location: St Louis, MO Join Date: Nov 2009
Posts: 27
|
Yes, I can confirm that your .3.ebwt file is OK. I have a bunch of bowtie indexes for mouse (self-built from Ensembl databases) and the .3 file is always a few kb only.
#9 |
Member
Location: Australia Join Date: Jun 2009
Posts: 34
|
It looks like either the index or the fastq file has a problem.
Is there any way to check the hg18 index file and the fastq file? My fastq file is converted from qseq.txt by first replacing every '.' with 'N', then using the perl script quoted above. Do I need to filter out the bad-quality/ambiguous sequences before I feed it to tophat?
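For reference, the conversion described here can be sketched in a few lines of Python. This is not the perl script from the other thread. It assumes the usual 11-column, tab-separated qseq.txt layout (machine, run, lane, tile, x, y, index, read number, sequence, quality, filter flag), and the read-name format and the `qseq_to_fastq` name are choices made for this example. Whether TopHat actually needs failed reads removed is not settled in this thread; the `keep_failed` switch just makes both options easy to try.

```python
def qseq_to_fastq(qseq_lines, keep_failed=False):
    """Yield FASTQ records (as strings) from an iterable of qseq.txt lines.

    Assumes 11 tab-separated fields per line, with '.' marking an
    uncalled base and a final filter flag of '1' for reads that passed.
    """
    for line in qseq_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 11:
            raise ValueError(
                f"expected 11 tab-separated fields, got {len(fields)}"
            )
        machine, run, lane, tile, x, y, _index, read_no, seq, qual, passed = fields
        if passed != "1" and not keep_failed:
            continue  # skip reads that failed the instrument's own filter
        # Read-name layout is an illustrative choice, not a required format.
        name = f"{machine}_{run}:{lane}:{tile}:{x}:{y}/{read_no}"
        yield f"@{name}\n{seq.replace('.', 'N')}\n+\n{qual}\n"


if __name__ == "__main__":
    example = "\t".join(
        ["M1", "1", "1", "1", "10", "20", "0", "1", "AC.T", "hhhh", "1"]
    )
    for record in qseq_to_fastq([example]):
        print(record, end="")
```

The quality string is passed through unchanged, which matches running TopHat with --solexa1.3-quals as in the first post.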
#11 |
Member
Location: Australia Join Date: Jun 2009
Posts: 34
|
Hi Xi Wang,
Thanks a lot for your help. If you are also doing human mRNA sequencing, do you know how long it takes TopHat to finish analyzing 1 sample? What's the minimum hardware setup for reasonable speed? Currently I am running on a RedHat Linux server and the speed is painfully slow. For only 1/6 of the total data for 1 sample, it has been running since midday Friday and still hasn't finished over the weekend. And I am aiming to analyze 20-40 samples in the near future. Do you think I could open a few connections to the Linux server and run TopHat in separate windows simultaneously?
#12 |
Senior Member
Location: MDC, Berlin, Germany Join Date: Oct 2009
Posts: 317
|
Hi,
I am also doing human mRNA mapping. It takes about 4-5 hours to map ~20 million reads to the human reference genome (hg18). Some parameters affect the mapping efficiency, such as read length (our data is 50 nt), the number of mismatches, and the number of multi-aligned loci allowed. How many reads do you have for one sample? I can't understand why it took so long to process one sample. Sure, you can run TopHat in separate windows simultaneously.
__________________
Xi Wang
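If several samples are run side by side, each run probably needs its own output directory, since the log in the first post shows TopHat writing to ./tophat_out/ by default. A minimal sketch of launching runs from one script instead of separate windows follows. It assumes the installed TopHat accepts `-o` (output directory) and `-p` (threads), which is worth confirming with `tophat --help` for the version in use; the sample names and file paths are hypothetical.

```python
import subprocess


def tophat_command(index, fastq, out_dir, threads=1):
    """Build the argument list for one TopHat run (nothing is executed here)."""
    return [
        "tophat",
        "--solexa1.3-quals",   # quality flag used in this thread
        "-p", str(threads),    # assumed option: threads per run
        "-o", out_dir,         # assumed option: per-sample output directory
        index,
        fastq,
    ]


def launch_all(samples, index, threads=1):
    """Start one TopHat process per sample, wait for all, return exit codes."""
    procs = [
        subprocess.Popen(tophat_command(index, fastq, f"tophat_out_{name}", threads))
        for name, fastq in samples.items()
    ]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Hypothetical sample sheet; only the commands are printed here.
    samples = {"s1": "s_1_1.fastq", "s2": "s_2_1.fastq"}
    for name, fastq in samples.items():
        print(" ".join(tophat_command("hg18", fastq, f"tophat_out_{name}")))
```

Running more processes than the server has free cores or memory for is likely to slow every run down, so the number of simultaneous samples should be chosen with that in mind.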
#14 |
Member
Location: Australia Join Date: Jun 2009
Posts: 34
|
Hi,
Thanks a lot for your information. I only know that my fastq file for 1 sample is around 3 GB after converting and joining all 120 qseq.txt files. I'm not sure how to find out how many reads there are in total. How do you know? The read length is 76 bp. I am running tophat with the default configuration, without any argument except --solexa1.3-quals. I guess you are setting the number of mismatches and the number of multi-aligned loci with arguments. If that's the case, what numbers do you use? PS. I am running TopHat through a university connection to the Linux server. Is that supposed to be faster than running on my local computer? How many processors do you have in your computer? Is a normal PC enough?
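On counting reads: a FASTQ record normally takes four lines (name, sequence, '+', qualities), so the read count is the line count divided by four. The sketch below assumes that unwrapped four-line layout; `count_fastq_reads` is a name made up for this example. As a side benefit, a line count that is not a multiple of four hints that the joined file may be truncated or malformed.

```python
def count_fastq_reads(path):
    """Count reads in a FASTQ file, assuming exactly four lines per record."""
    with open(path, "rb") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(
            f"{path}: {n_lines} lines is not a multiple of 4; "
            "the file may be truncated or use wrapped sequences"
        )
    return n_lines // 4


if __name__ == "__main__":
    import sys

    # Usage: python count_reads.py s_1_1.fastq
    for fastq_path in sys.argv[1:]:
        print(fastq_path, count_fastq_reads(fastq_path))
```

The file is read in binary mode and streamed line by line, so a 3 GB input should not need to fit in memory.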
#15 |
Member
Location: Australia Join Date: Jun 2009
Posts: 34
|
Another question:
Is there any need to run Bowtie alone, given that TopHat will call Bowtie anyway?
#16 |
Member
Location: Australia Join Date: Jun 2009
Posts: 34
|
Another check:
My hg18.fa, constructed from the UCSC hg18 download on the TopHat website, is 3131776827 bytes. Is yours the same size?
#17 |
Senior Member
Location: MDC, Berlin, Germany Join Date: Oct 2009
Posts: 317
|
Quote:
Quote:
Quote:
__________________
Xi Wang |
#20 |
Member
Location: Australia Join Date: Jun 2009
Posts: 34
|
Hi Xi,
Thanks a lot for your help. Last night I managed to run 1 sample through tophat successfully (it took 17 hours). I tried to visualize the output, coverage.wig and junctions.bed, in the UCSC genome browser. When I load coverage.wig, it shows:

    Error File 'coverage.wig' - Error line 3771557 of custom track: chromEnd less than 1 (0)

When I load junctions.bed, it only shows chromosome 20?

    Name       Description       Type  Doc  Items  Pos
    junctions  TopHat junctions  bed        80207  chr20:

By the way, my junctions file is 6430K and my coverage file is around 173M. Are these normal? How do you do your visualization? Do you use Cufflinks to quantify the expression after TopHat?
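The "chromEnd less than 1 (0)" message suggests that at least one interval in coverage.wig ends at position 0. One possible workaround, sketched below, is to drop such lines before uploading. This assumes the file uses four-column bedGraph-style records (chrom, start, end, value) after a track line, which has not been confirmed for this output, and it does not explain why the line was written in the first place. `drop_bad_intervals` is a hypothetical helper name.

```python
def drop_bad_intervals(lines):
    """Yield lines unchanged, skipping 4-column records whose end is below 1.

    Track/header lines and anything that does not look like
    'chrom start end value' are passed through untouched.
    """
    for line in lines:
        parts = line.split()
        looks_like_record = (
            len(parts) == 4 and parts[1].isdigit() and parts[2].isdigit()
        )
        if looks_like_record and int(parts[2]) < 1:
            continue  # the kind of interval the browser rejected
        yield line


if __name__ == "__main__":
    import sys

    # Usage: python fix_wig.py < coverage.wig > coverage.fixed.wig
    sys.stdout.writelines(drop_bad_intervals(sys.stdin))
```

It is worth looking at line 3771557 of the original file first to see what the rejected record actually contains.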