SEQanswers

02-08-2010, 01:04 AM   #1
yasu

tophat running problem

Hi all,

I'm trying to align human mRNA-seq reads with TopHat (v1.0.12).
I got proper output on a preliminary dataset (the first 100K reads from each .fq file), but the run fails on the real dataset (each file contains ~17M reads).
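
A subset like the preliminary one can be made by copying the first N four-line records from each file. The following is a minimal Python sketch, not the command actually used in this post; the function name `head_fastq` and the output file name in the usage note are hypothetical, and it assumes every FASTQ record spans exactly four lines.

```python
import itertools


def head_fastq(src_path, dst_path, n_reads):
    """Copy the first n_reads records of a FASTQ file to dst_path.

    Assumption: each record is exactly 4 lines
    (header, sequence, '+' line, quality string).
    """
    with open(src_path) as src, open(dst_path, "w") as dst:
        # islice stops after 4 * n_reads lines, or earlier at end of file.
        dst.writelines(itertools.islice(src, 4 * n_reads))
```

Usage would look like `head_fastq("s_1_1.fq", "s_1_1.subset.fq", 100000)`, repeated for each input file so that the paired files stay in step.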

I would appreciate any help you could give me with this.

Thanks in advance.

-Yasu

### preliminary_test ###

$ tophat -r 10 -p 8 -o tophat_hg19_test hg19 s_1_1.head4000000.fq,s_6_1.head4000000.fq,s_7_1.head4000000.fq s_1_2.head4000000.fq,s_6_2.head4000000.fq,s_7_2.head4000000.fq

[Mon Feb 8 13:31:03 2010] Beginning TopHat run (v1.0.12)
-----------------------------------------------
[Mon Feb 8 13:31:03 2010] Preparing output location tophat_hg19_test/
[Mon Feb 8 13:31:03 2010] Checking for Bowtie index files
[Mon Feb 8 13:31:03 2010] Checking for reference FASTA file
[Mon Feb 8 13:31:03 2010] Checking for Bowtie
Bowtie version: 0.11.3.0
[Mon Feb 8 13:31:03 2010] Checking reads
seed length: 43bp
format: fastq
quality scale: --phred33-quals
[Mon Feb 8 13:31:51 2010] Mapping reads against hg19 with Bowtie
[Mon Feb 8 13:34:24 2010] Joining segment hits
[Mon Feb 8 13:34:59 2010] Mapping reads against hg19 with Bowtie
[Mon Feb 8 13:37:30 2010] Joining segment hits
[Mon Feb 8 13:38:04 2010] Searching for junctions via segment mapping
[Mon Feb 8 13:44:59 2010] Retrieving sequences for splices
[Mon Feb 8 13:46:42 2010] Indexing splices
[Mon Feb 8 13:47:58 2010] Mapping reads against segment_juncs with Bowtie
[Mon Feb 8 13:48:47 2010] Joining segment hits
[Mon Feb 8 13:49:26 2010] Mapping reads against segment_juncs with Bowtie
[Mon Feb 8 13:50:15 2010] Joining segment hits
[Mon Feb 8 13:50:52 2010] Reporting output tracks
-----------------------------------------------
Run complete [00:33:58 elapsed]


### real_data ###

$ tophat -r 10 -p 8 -o tophat_hg19 hg19 s_1_1.fq,s_6_1.fq,s_7_1.fq s_1_2.fq,s_6_2.fq,s_7_2.fq

[Mon Feb 8 14:24:47 2010] Beginning TopHat run (v1.0.12)
-----------------------------------------------
[Mon Feb 8 14:24:47 2010] Preparing output location tophat_hg19/
[Mon Feb 8 14:24:47 2010] Checking for Bowtie index files
[Mon Feb 8 14:24:47 2010] Checking for reference FASTA file
[Mon Feb 8 14:24:47 2010] Checking for Bowtie
Bowtie version: 0.11.3.0
[Mon Feb 8 14:24:47 2010] Checking reads
seed length: 43bp
format: fastq
quality scale: --phred33-quals
[Mon Feb 8 14:39:23 2010] Mapping reads against hg19 with Bowtie
[Mon Feb 8 15:24:40 2010] Joining segment hits
[Mon Feb 8 15:35:38 2010] Mapping reads against hg19 with Bowtie
[Mon Feb 8 16:18:44 2010] Joining segment hits
[Mon Feb 8 16:18:44 2010] Searching for junctions via segment mapping
Warning: junction database is empty!
[Mon Feb 8 18:01:42 2010] Joining segment hits
[Mon Feb 8 18:11:38 2010] Joining segment hits
[Mon Feb 8 18:11:38 2010] Reporting output tracks
[FAILED]
Error: Report generation failed with err = 1
Traceback (most recent call last):
  File "/bin/tophat", line 1518, in ?
    sys.exit(main())
  File "/bin/tophat", line 1490, in main
    params.gff_annotation)
  File "/bin/tophat", line 936, in compile_reports
    exit(1)
TypeError: 'str' object is not callable
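
A note on the final TypeError: the `in ?` frame label suggests an older Python interpreter (2.4 or earlier), where the builtin name `exit` was bound to a plain help string rather than a callable. If that is the case here, the TypeError is only a secondary failure raised while the script tried to exit, and the underlying problem is the earlier "Report generation failed with err = 1". The sketch below reproduces the effect on a modern interpreter; `legacy_exit` is a stand-in for that string and both function names are made up for the illustration.

```python
import sys

# Stand-in for the string that old interpreters bound to the name `exit`
# (assumption: the exact wording is irrelevant to the demonstration).
legacy_exit = "Use Ctrl-D (i.e. EOF) to exit."


def call_legacy_exit(code):
    """Call a str the way exit(1) would on such an interpreter."""
    try:
        legacy_exit(code)
    except TypeError as err:
        return str(err)
    return None


def call_sys_exit(code):
    """sys.exit is a real function and raises SystemExit with the code."""
    try:
        sys.exit(code)
    except SystemExit as err:
        return err.code
```

Calling `call_legacy_exit(1)` returns the same "object is not callable" message seen in the traceback, while `call_sys_exit(1)` returns the exit code, which is why scripts meant to run on old interpreters use `sys.exit`.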

02-08-2010, 09:06 PM   #2
yasu

Here is the reports.log file from the real_data run (the one that failed).

### Real_data (reports.log) ###

tophat_reports v1.0.12
---------------------------------------
Error: cannot open map file for reading

#####################

Comparing the run.log files from preliminary_test (the successful run) and real_data (the failed run), it looks like "/bin/segment_juncs" did not work properly in the failed run.

Can anybody help?

Thanks,

-Yasu

02-08-2010, 10:08 PM   #3
Cole Trapnell

The fact that TopHat thinks the seed length is 43bp is concerning. The default is 25, and it shouldn't be different unless you specified --segment-length, which you didn't. TopHat currently requires that FASTQ files have records where all of the nucleotides for each read appear on a single line. Same goes for the quality strings - all the quality characters need to be on one line. This is a limitation I haven't had time to fix yet. Can you verify that your FASTQ file is formatted this way?
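
One way to run that check is a short script that reads the file in blocks of four lines and flags any record that does not fit the single-line layout. This is a minimal Python sketch under that assumption; `find_fastq_problems` is a hypothetical helper, not part of TopHat.

```python
def find_fastq_problems(path, limit=10):
    """Return up to `limit` (line_number, message) tuples for a FASTQ file.

    Assumes the layout described above: header, sequence, '+' line and
    quality string, each on exactly one line.
    """
    problems = []
    with open(path) as handle:
        line_no = 0
        while len(problems) < limit:
            # readline() returns "" at end of file, so a short final
            # record shows up as empty fields.
            record = [handle.readline().rstrip("\r\n") for _ in range(4)]
            if not any(record):
                break  # nothing left to read
            header, seq, plus, qual = record
            if not header.startswith("@"):
                problems.append((line_no + 1, "header does not start with '@'"))
            if not plus.startswith("+"):
                problems.append((line_no + 3, "third line does not start with '+'"))
            if len(seq) != len(qual):
                problems.append((line_no + 4, "sequence and quality lengths differ"))
            line_no += 4
    return problems[:limit]
```

An empty list means every record read fits the four-line layout; a wrapped sequence or quality string shifts the following lines and is reported at the first record where that happens.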

02-08-2010, 10:36 PM   #4
yasu

Thanks for your kind help!!

My FASTQ file looks like this. I omitted the sequence and position ID from the line after "+". Could that be causing the problem?

-Yasu

###########
@HWI-EAS368:1:1:9:316#0/1
CTGGATGATAACATTCCAGAAGATGACTCAGGTGTCCCCACCC
+
BB66AB9ACBB@BCBAAA><BBBAAB?BBB@@@BA?B@BB@AB
@HWI-EAS368:1:1:9:424#0/1
CTCCCTGCCAGATATCGAGGAGGTGAAAGACCAGAGCAGGAAC
+
BCBBB>?>@CBABBB;A877??.:<<B@;@@?=>?A>?6AAA?
@HWI-EAS368:1:1:9:1060#0/1
TGGATGGTTCAGGATAATCACCTGAGCAGTGAAGCCAGCTGCT
+
BBBBB=?@BBB?=@A9AA@CAA><<@:5>7?=A?A@?=A???@
@HWI-EAS368:1:1:9:410#0/1
CGGAGGCGGAGGCTTGGGTGCGTTCAAGATTCAGCTTCACCCG
+
AA9AAA=:A7'=7=?4+366=AA@:A>999B:=2,=>1014>7
@HWI-EAS368:1:1:9:807#0/1
CGAACATTTCTGGCCCCCAAGTGTCAGCCCATTCACGTAAAAA
+
BBBBBBC@@<:;6BC>:@2<B=BBB@7;BB=:C@799:BBB?%
@HWI-EAS368:1:1:9:405#0/1
TGTAAAGCCTGAAACAGCTGCCTGTGTGGGACTGAGATGCAGG
+
?=>B=AA@AB@AAB?88>@@@BB>?B>A<=>?A81<-<@@B@@
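
In the six records above, every sequence is 43 characters long, which matches the "seed length: 43bp" that TopHat reports, and quality characters such as '%' (ASCII 37) fall below 59, which is consistent with the Phred+33 scale it detected. To confirm the same holds across a whole file, something like the following Python sketch could be used; `summarize_fastq` is a hypothetical helper and it assumes four lines per record.

```python
from collections import Counter


def summarize_fastq(path, max_records=100000):
    """Return (read_length_counts, min_quality_code, max_quality_code).

    Assumes 4 lines per record; only the first max_records are scanned.
    """
    lengths = Counter()
    min_q = None
    max_q = None
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i // 4 >= max_records:
                break
            text = line.rstrip("\r\n")
            if i % 4 == 1:
                # Second line of each record: the read sequence.
                lengths[len(text)] += 1
            elif i % 4 == 3 and text:
                # Fourth line: quality string, one character per base.
                codes = [ord(c) for c in text]
                min_q = min(codes) if min_q is None else min(min_q, min(codes))
                max_q = max(codes) if max_q is None else max(max_q, max(codes))
    return lengths, min_q, max_q
```

More than one key in the length counts would indicate reads of mixed length, and a minimum quality code below 59 points to Phred+33 rather than a +64 encoding.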

02-08-2010, 10:54 PM   #5
yasu

I added '--segment-length 25', but the output still reports the following:

[Tue Feb 9 16:47:40 2010] Checking reads
seed length: 43bp
format: fastq
quality scale: --phred33-quals

Am I doing something wrong?

-Yasu