SEQanswers

Old 10-17-2012, 10:06 AM   #1
Sniwells
Junior Member
 
Location: Germany

Join Date: Sep 2012
Posts: 7
Tophat2 "joining segment hits" does not complete

Tophat2 runs nicely (6 hours) up to the step "joining segment hits". But this step (a single-core process named "long_spanning_reads") has now been running for almost a week.
data: 454-sequencing reads up to 500 nucleotides long probably containing a lot of exon-exon junctions.
Tophat version 2.0.0
Bowtie version: 2.0.0.6
here is the command:
tophat2 -p 12 -o tophat_out genome 454_data.fastq.gz

Does anyone have an idea what could cause this? Or does anyone know which parameter can be adjusted to reduce the time for this step?

Thanks

Last edited by Sniwells; 10-17-2012 at 11:20 AM.
Old 10-18-2012, 05:41 AM   #2
prios
Junior Member
 
Location: Barcelona

Join Date: Jun 2012
Posts: 4

Hello Sniwells,

I'm having the same problem here. I have 12 libraries from 2 runs (454) and Tophat gets stuck at that step. The most annoying thing is that for some libraries Tophat did a quick alignment (around 5-10 h), but for 3 of them it took 1 week to complete, and I'm still waiting for the last 4 to complete (more than 12 days).

I have not found any topics related to this problem in this forum, nor have I received any answers from the authors of Tophat. Since the libraries have a very similar number of reads and 454 does not seem to be the most popular choice for RNA-seq, my thought is that this might have to do with the length of the reads (which in 454 data are much longer than in Illumina data).

Anybody's got a clue?
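
One way to check the read-length idea is to compare length distributions between a library that finished quickly and one that stalled. The following is only a minimal Python sketch, assuming standard 4-line FASTQ records (sequences not wrapped over several lines). The function names `read_length_counts` and `summarize` are illustrative and not part of Tophat.

```python
import gzip
from collections import Counter
from typing import Dict


def read_length_counts(path: str) -> Counter:
    """Count how many reads have each length in a (possibly gzipped) FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    counts: Counter = Counter()
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # In a 4-line record the sequence is the second line.
            if line_number % 4 == 1:
                counts[len(line.strip())] += 1
    return counts


def summarize(counts: Counter) -> Dict[str, float]:
    """Return the number of reads plus min, max and mean read length."""
    total = sum(counts.values())
    mean = sum(length * n for length, n in counts.items()) / total
    return {"reads": total, "min": min(counts), "max": max(counts), "mean": mean}
```

Running `summarize(read_length_counts(...))` on a fast and a slow library and comparing the results would show whether the slow ones really contain longer reads. Similar summaries would suggest the cause lies elsewhere.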
Old 10-18-2012, 06:34 AM   #3
HSV-1
Member
 
Location: asia

Join Date: Jul 2012
Posts: 38

Did you use "--no-coverage-search"? If not, it will take a very long time.
Old 10-22-2012, 06:39 AM   #4
Sniwells
Junior Member
 
Location: Germany

Join Date: Sep 2012
Posts: 7

Right after your suggestion I started tophat with the following command:
tophat2 --no-coverage-search -p 12 -o tophat_out genome 454_data.fastq.gz
But it seems this parameter does not solve the problem, because tophat2 has been stuck at the same point since the day of your post.
Maybe tophat is not designed for long 454 reads?
Old 10-22-2012, 05:33 PM   #5
HSV-1
Member
 
Location: asia

Join Date: Jul 2012
Posts: 38

At which step?
When tophat is writing the segment and junction files, it can take a few days or even a week.

Quote:
Originally Posted by Sniwells View Post
Right after your suggestion I started tophat with the following command:
tophat2 --no-coverage-search -p 12 -o tophat_out genome 454_data.fastq.gz
But it seems this parameter does not solve the problem, because tophat2 has been stuck at the same point since the day of your post.
Maybe tophat is not designed for long 454 reads?

Last edited by HSV-1; 10-22-2012 at 05:38 PM.
Old 10-24-2012, 01:54 AM   #6
Sniwells
Junior Member
 
Location: Germany

Join Date: Sep 2012
Posts: 7

Quote:
Originally Posted by HSV-1 View Post
At which step?
The same step:
"Tophat2 runs nicely (6 hours) up to the step "joining segment hits". But this step (a single-core process named "long_spanning_reads") has now been running for almost a week."
Old 10-24-2012, 02:01 AM   #7
HSV-1
Member
 
Location: asia

Join Date: Jul 2012
Posts: 38

That is sort of normal.
Make sure your queue allows this much run time, or the job will be killed before it finishes.

Quote:
Originally Posted by Sniwells View Post
The same step:
"Tophat2 runs nicely (6 hours) up to the step "joining segment hits". But this step (a single-core process named "long_spanning_reads") has now been running for almost a week."
Old 11-01-2012, 05:21 AM   #8
Sniwells
Junior Member
 
Location: Germany

Join Date: Sep 2012
Posts: 7

Quote:
Originally Posted by HSV-1 View Post
That is sort of normal.
Make sure your queue allows this much run time, or the job will be killed before it finishes.
The process is still running (14 days). Let's see if there will be a happy ending.
Old 11-16-2012, 03:10 PM   #9
Sniwells
Junior Member
 
Location: Germany

Join Date: Sep 2012
Posts: 7

I stopped the process after it had been running for nearly a month.
Has anyone run tophat with long 454 reads successfully?
Old 11-16-2012, 04:48 PM   #10
HSV-1
Member
 
Location: asia

Join Date: Jul 2012
Posts: 38

I didn't know your reads were from Roche 454.
There is a special protocol for long reads.
Old 01-17-2013, 06:54 AM   #11
prios
Junior Member
 
Location: Barcelona

Join Date: Jun 2012
Posts: 4

Hello again!

I've finally managed to make Tophat2 work on my problematic 454 reads. What I did was split my original fastq file into several smaller ones and run Tophat separately on each of them. Then I took the sub-file that took the longest to finish, split it into sub-sub-files, and ran Tophat again on each of them.

After several rounds, I came across a single read that, if deleted from the original fastq file, makes Tophat run smoothly and fast.

I still don't know what makes those reads special, as they are neither the longest nor the shortest, nor do they show bad quality...

Anyway, hope it works.
Pablo
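
The split-and-rerun procedure described in the post above is essentially a bisection over the reads. Below is a minimal Python sketch of that idea, not the poster's actual script. It assumes standard 4-line FASTQ records and that exactly one read causes the stall. The callback `is_slow` is hypothetical: in practice it would write the subset with `write_fastq`, run Tophat on it, and report whether the run exceeded some time limit.

```python
import gzip
from typing import Callable, Iterator, List, Tuple

Record = Tuple[str, str, str, str]  # header, sequence, '+' line, qualities


def read_fastq(path: str) -> Iterator[Record]:
    """Yield 4-line FASTQ records; assumes sequences are not line-wrapped."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            lines = [handle.readline().rstrip("\n") for _ in range(4)]
            if not lines[0]:  # end of file
                break
            yield tuple(lines)


def write_fastq(records: List[Record], path: str) -> None:
    """Write records to an uncompressed FASTQ file."""
    with open(path, "w") as handle:
        for record in records:
            handle.write("\n".join(record) + "\n")


def find_slow_read(records: List[Record],
                   is_slow: Callable[[List[Record]], bool]) -> Record:
    """Halve the read set repeatedly, keeping the half that is still slow.

    Assumes exactly one read is responsible for the slowdown.
    """
    while len(records) > 1:
        mid = len(records) // 2
        first_half = records[:mid]
        records = first_half if is_slow(first_half) else records[mid:]
    return records[0]
```

With this scheme the number of Tophat runs grows with the logarithm of the number of reads, so even a few hundred thousand reads need fewer than twenty rounds. If several reads are problematic, the search would have to be repeated after removing each one found.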

Tags
joining segment hits, long_spanning_reads, parameter, time, tophat

All times are GMT -8.