SEQanswers

Old 10-29-2011, 11:47 PM   #1
upadhyayanm
Junior Member
 
Location: Canberra, Australia

Join Date: Oct 2011
Posts: 3
Default TopHat error: segment-based junction search failed with err =1

Hi

Lately I have had a problem running TopHat 1.3.1 on 100 bp paired-end Illumina HiSeq RNA reads. After cleaning (quality trimming, duplicate removal, adapter removal) I split the files (taking care not to split an entry's sequence and quality lines) and fed them to TopHat. Please note that I have more left-kept reads because I supply an extra file of leftover unpaired reads on the left side. Also, I have noticed in previous successful runs that even though the input paired FASTQ files contain the same number of sequences, the left-read and right-read counts reported in the log differ slightly.
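For readers trying to reproduce this setup, here is a minimal sketch of the splitting and invocation described above; the file names, chunk size, output directory and thread count are placeholders, and only the index name (../PG210SC5) is taken from the log below.

Code:
# Split each cleaned FASTQ on a 4-line boundary so no record's sequence/quality pair is cut in half.
# 40,000,000 lines = 10M reads per chunk; adjust to taste.
split -d -l 40000000 left_clean.fq  left_chunk.
split -d -l 40000000 right_clean.fq right_chunk.

# TopHat accepts comma-separated lists of read files; the leftover unpaired reads are appended
# on the "left" side, which is why left_kept_reads ends up larger than right_kept_reads.
tophat -p 8 -o tophat_out ../PG210SC5 \
    left_chunk.00,left_chunk.01,left_chunk.02,left_chunk.03,left_unpaired.fq \
    right_chunk.00,right_chunk.01,right_chunk.02,right_chunk.03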

Here is the log:

[Thu Oct 27 18:33:40 2011] Beginning TopHat run (v1.3.1)
-----------------------------------------------
[Thu Oct 27 18:33:40 2011] Preparing output location ./tophat_out/
[Thu Oct 27 18:33:40 2011] Checking for Bowtie index files
[Thu Oct 27 18:33:40 2011] Checking for reference FASTA file
[Thu Oct 27 18:33:40 2011] Checking for Bowtie
Bowtie version: 0.12.7.0
[Thu Oct 27 18:33:40 2011] Checking for Samtools
Samtools Version: 0.1.12a
[Thu Oct 27 18:33:40 2011] Generating SAM header for ../PG210SC5
[Thu Oct 27 18:33:40 2011] Preparing reads
format: fastq
quality scale: phred33 (default)
Left reads: min. length=50, count=134790672
Right reads: min. length=50, count=118121205
[Thu Oct 27 20:34:22 2011] Mapping left_kept_reads against PG210SC5 with Bowtie
[Thu Oct 27 21:42:15 2011] Processing bowtie hits
[Thu Oct 27 23:08:30 2011] Mapping left_kept_reads_seg1 against PG210SC5 with Bowtie (1/4)
[Fri Oct 28 00:27:19 2011] Mapping left_kept_reads_seg2 against PG210SC5 with Bowtie (2/4)
[Fri Oct 28 01:47:04 2011] Mapping left_kept_reads_seg3 against PG210SC5 with Bowtie (3/4)
[Fri Oct 28 02:57:47 2011] Mapping left_kept_reads_seg4 against PG210SC5 with Bowtie (4/4)
[Fri Oct 28 04:25:49 2011] Mapping right_kept_reads against PG210SC5 with Bowtie
[Fri Oct 28 05:26:52 2011] Processing bowtie hits
[Fri Oct 28 06:48:08 2011] Mapping right_kept_reads_seg1 against PG210SC5 with Bowtie (1/4)
[Fri Oct 28 08:00:12 2011] Mapping right_kept_reads_seg2 against PG210SC5 with Bowtie (2/4)
[Fri Oct 28 09:11:43 2011] Mapping right_kept_reads_seg3 against PG210SC5 with Bowtie (3/4)
[Fri Oct 28 10:21:22 2011] Mapping right_kept_reads_seg4 against PG210SC5 with Bowtie (4/4)
[Fri Oct 28 11:56:21 2011] Searching for junctions via segment mapping
[FAILED]
Error: segment-based junction search failed with err =1

____________________________________________________________________________________________

In the segment_juncs.log the last entry reads:

FZStream::rewind() popen(gzip -cd './tophat_out/tmp/left_kept_reads_seg1_missing.fq.z') failed
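A few quick checks can help separate a missing or unreadable temporary file from an out-of-memory fork() failure; this is only a rough sketch, using the path exactly as reported in the log entry above.

Code:
# Is gzip on the PATH of the TopHat process, and is the temp file present and intact?
which gzip
ls -lh ./tophat_out/tmp/left_kept_reads_seg1_missing.fq.z
gzip -t ./tophat_out/tmp/left_kept_reads_seg1_missing.fq.z

# popen() also fails when fork() cannot allocate memory; the kernel log records OOM kills.
dmesg | grep -i -e 'out of memory' -e 'killed process'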


I have previously used such a mixture of paired and unpaired reads successfully (I think!) with another data set, although those were smaller read sets. Even with the data above, when I use only one of the four split file pairs, the run completes fine.

I would appreciate it if anyone could help me resolve this problem.
upadhyayanm is offline   Reply With Quote
Old 12-30-2011, 11:43 AM   #2
canbruce
Junior Member
 
Location: CT, USA

Join Date: Dec 2011
Posts: 1
Default

I have the same problem. Were you able to figure out the reason for this error?

-canbruce
canbruce is offline   Reply With Quote
Old 01-02-2012, 12:55 AM   #3
upadhyayanm
Junior Member
 
Location: Canberra, Australia

Join Date: Oct 2011
Posts: 3
Default

Not yet. I suspect TopHat is running out of memory. Although I am running it on a Linux (Ubuntu) machine with 48 GB of RAM, I think that is still not enough for inputs this large.
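One way to test that suspicion, assuming the junction step is the culprit, is to watch the resident memory of segment_juncs while the "Searching for junctions via segment mapping" step is running; a minimal sketch:

Code:
# Print segment_juncs' memory footprint (resident and virtual, in kB) every 60 seconds.
watch -n 60 'ps -C segment_juncs -o pid,rss,vsz,etime,args'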
upadhyayanm is offline   Reply With Quote
Old 01-10-2012, 12:35 PM   #4
townway
Member
 
Location: Rockville

Join Date: May 2009
Posts: 40
Default

I had the same problem today; I hope someone can step in and point the way to a fix.

My data came straight out of the Illumina pipeline as two FASTQ files.
Quote:
Originally Posted by upadhyayanm View Post
Hi

Lately I have had a problem running TopHat 1.3.1 on 100 bp paired-end Illumina HiSeq RNA reads. [...]

Error: segment-based junction search failed with err =1
townway is offline   Reply With Quote
Old 01-10-2012, 03:45 PM   #5
Xi Wang
Senior Member
 
Location: MDC, Berlin, Germany

Join Date: Oct 2009
Posts: 317
Default

Hi townway, I am just wondering how much data you have. How many reads did you feed to TopHat?


Quote:
Originally Posted by townway View Post
I had the same problem today; I hope someone can step in and point the way to a fix.

My data came straight out of the Illumina pipeline as two FASTQ files.
__________________
Xi Wang
Xi Wang is offline   Reply With Quote
Old 01-11-2012, 06:28 AM   #6
townway
Member
 
Location: Rockville

Join Date: May 2009
Posts: 40
Default

My data is around 200M reads from one HiSeq lane, and I used 16 GB of memory to run TopHat 1.3.3 with the coverage, microexon and butterfly search options.
By the way, it worked well with an older version of TopHat.
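For reference, a hedged sketch of that kind of invocation (the index and read file names are placeholders); these three search options are the memory-hungry part on ~200M reads:

Code:
tophat -p 8 --coverage-search --microexon-search --butterfly-search \
    -o tophat_out genome_index reads_1.fq reads_2.fq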

Quote:
Originally Posted by Xi Wang View Post
Hi townway, I am just wondering how much data you have. How many reads did you feed to TopHat?
townway is offline   Reply With Quote
Old 01-11-2012, 07:34 AM   #7
biznatch
Senior Member
 
Location: Canada

Join Date: Nov 2010
Posts: 124
Default

The butterfly search option uses a lot of memory. I'm pretty sure you'll need a lot more than 16 GB of memory to align 200M reads with that option; I have 16 GB and ran out of memory trying to align ~30M 100 bp PE reads with the butterfly option.
biznatch is offline   Reply With Quote
Old 01-11-2012, 07:46 AM   #8
townway
Member
 
Location: Rockville

Join Date: May 2009
Posts: 40
Default

Yes, that is true. Without those options it works well now.
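In other words, simply omitting those options (and, if desired, disabling coverage search explicitly) brings memory use down; a sketch with placeholder names:

Code:
tophat -p 8 --no-coverage-search -o tophat_out genome_index reads_1.fq reads_2.fq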

Quote:
Originally Posted by biznatch View Post
The butterfly search option uses a lot of memory. I'm pretty sure you'll need a lot more than 16 GB of memory to align 200M reads with that option; I have 16 GB and ran out of memory trying to align ~30M 100 bp PE reads with the butterfly option.
townway is offline   Reply With Quote
Old 01-11-2012, 02:24 PM   #9
Xi Wang
Senior Member
 
Location: MDC, Berlin, Germany

Join Date: Oct 2009
Posts: 317
Default

TopHat has been updated to version 1.4.0 (beta). Has anyone tried this new version yet? A big change in this release is that TopHat can first map reads against a user-supplied transcriptome; I think that strategy should be much more stable.
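For anyone who wants to try it, a hedged sketch of that transcriptome-first mode (the GTF, index and output names are placeholders; check your version's --help for the exact option names):

Code:
tophat -p 8 -G genes.gtf --transcriptome-index transcriptome_data/known \
    -o tophat_out genome_index reads_1.fq reads_2.fq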
__________________
Xi Wang
Xi Wang is offline   Reply With Quote
Old 06-07-2012, 08:26 AM   #10
kesner
Member
 
Location: ma

Join Date: Apr 2012
Posts: 10
Default ReadStream::getRead() called with out-of-order id#!

I get a similar error, but with a different message (see the title).
After looking at the code, I think the error has to do with multi-core threading and read IDs: in the section of the code I looked at, read IDs seem to be handled differently for threaded and non-threaded execution (I think). I am running the latest version (2.0.3) and am trying again without threading.
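The single-threaded retry is just the same command with the thread count forced to one, for example (placeholder names):

Code:
tophat -p 1 -o tophat_out genome_index reads_1.fq reads_2.fq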
kesner is offline   Reply With Quote
Old 06-16-2012, 04:40 PM   #11
Auction
Member
 
Location: california

Join Date: Jul 2009
Posts: 24
Default

Kesner, have you solved the problem by not using threading? I have the same problem in segment_juncs:
Processed 4000000 root segment groups
Error: ReadStream::getRead() called with out-of-order id#!

I'm using TopHat 1.4.1 (I get the same error with 2.0.3, but there it comes from tophat_reports). It should not be a memory problem because I have 96 GB of RAM, so it may be something related to threading.
Quote:
Originally Posted by kesner View Post
I get a similar error, but with a different message (see the title).
After looking at the code, I think the error has to do with multi-core threading and read IDs: in the section of the code I looked at, read IDs seem to be handled differently for threaded and non-threaded execution (I think). I am running the latest version (2.0.3) and am trying again without threading.
Auction is offline   Reply With Quote
Old 06-18-2012, 08:30 AM   #12
kesner
Member
 
Location: ma

Join Date: Apr 2012
Posts: 10
Default re: problem fixed?

I think I got past the problem by using single threading. Since there are many processes on the machine I am using, it is possible some other resource failure was to blame.

Now my problem is that the run is taking forever to complete. The alignments are finished, but the junction-processing step gets through about one chromosome a day, and I'm not sure throwing multiple cores at this step helps anyway. I know my reads are contaminated with a lot of background, and I suspect that is why I am having problems with the whole process in general.
kesner is offline   Reply With Quote
Old 06-28-2012, 05:18 AM   #13
Auction
Member
 
Location: california

Join Date: Jul 2009
Posts: 24
Default

I agree that something is probably wrong with the resource allocation. I re-ran some samples (also multi-threaded); sometimes I got the same error message, and sometimes the run finished successfully. So the problem is not reproducible, and it may depend heavily on the state of the machine at run time.



Quote:
Originally Posted by kesner View Post
I think I got past the problem by using single threading. Since there are many processes on the machine I am using, it is possible some other resource failure was to blame.

Now my problem is that the run is taking forever to complete. The alignments are finished, but the junction-processing step gets through about one chromosome a day, and I'm not sure throwing multiple cores at this step helps anyway. I know my reads are contaminated with a lot of background, and I suspect that is why I am having problems with the whole process in general.
Auction is offline   Reply With Quote
Old 06-28-2012, 06:21 AM   #14
kesner
Member
 
Location: ma

Join Date: Apr 2012
Posts: 10
Default Does the latest TopHat version solve the problem?

I was wondering if you still see the problem with the latest code build of TopHat2.
kesner is offline   Reply With Quote
Old 07-03-2012, 08:57 AM   #15
ians
Member
 
Location: St. Louis, MO

Join Date: Aug 2011
Posts: 53
Default

I am getting the same error with tophat 2.0.0.

tophat.log:
Code:
....
[2012-06-30 11:50:15] Mapping right_kept_reads.m2g_um_seg4 against mm9.fa with Bowtie2 (4/4)
/usr/local/bin/tophat-2.0.0/fix_map_ordering: /lib64/libz.so.1: no version information available (required by /usr/local/bin/tophat-2.0.0/fix_map_ordering)
[2012-07-01 00:20:11] Searching for junctions via segment mapping
        [FAILED]
Error: segment-based junction search failed with err =1
Error: ReadStream::getRead() called with out-of-order id#!
segment_juncs.log:
Code:
...
        Loading chrUn_random...done
        Loading chrX_random...done
        Loading chrY_random...done
        Loading ...done
>> Performing segment-search:
Loading left segment hits...
Error: ReadStream::getRead() called with out-of-order id#!
Has anyone uncovered anything recently? At U Texas they report that single threading allowed proper execution. Does anyone know how to "continue" a TopHat run and restart it from the segment-based junction search? I'm going to try hacking the Python wrapper script, but I hope someone has done it before. I have a dozen or so samples that have been aligning for about a week, and I don't want to redo the alignments, especially with only one core (ouch!).
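Before hacking the wrapper, it may be worth checking whether your TopHat build already has a resume option (some newer 2.0.x releases do; confirm with tophat --help). If it does, something like the following restarts from the last completed step using the intermediate files already in the output directory (directory name is a placeholder):

Code:
tophat -R tophat_out   # -R/--resume: only if your build lists it in --help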
ians is offline   Reply With Quote
Old 07-03-2012, 10:02 AM   #16
ians
Member
 
Location: St. Louis, MO

Join Date: Aug 2011
Posts: 53
Default

I found the segment_juncs command that died in runs.log. I tried rerunning the exact command, but with p=1 (single-threaded), and got the same error.

I've dug through the source of reads.cpp and found that read access apparently must be sequential. Does anyone know in which file these reads are listed?
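Concretely, the retry looks something like this; the log path is whatever your own run wrote, and only the thread count is changed:

Code:
# Find the exact segment_juncs invocation recorded by the wrapper...
grep -A 2 segment_juncs tophat_out/logs/*.log
# ...then paste it back on the command line with the thread count set to 1.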
ians is offline   Reply With Quote
Old 07-06-2012, 08:46 AM   #17
ians
Member
 
Location: St. Louis, MO

Join Date: Aug 2011
Posts: 53
Default

I'm still digging through the source hoping for some light. Any insight would be greatly appreciated.
ians is offline   Reply With Quote
Old 07-09-2012, 08:03 AM   #18
ians
Member
 
Location: St. Louis, MO

Join Date: Aug 2011
Posts: 53
Default Solved?

Over the weekend I ran a test using data that had failed before, and the run was successful!
This time I ran with only as many threads as the box has cores, in contrast with the previous failed runs, where I requested more threads than cores (2x).
I'm optimistic that I'll be able to rerun the other samples successfully as well; I'll let you know if that isn't the case.
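In practice that just means matching -p to the physical core count rather than oversubscribing, e.g. (placeholder names apart from nproc):

Code:
# Use exactly as many TopHat threads as the machine has cores.
tophat -p "$(nproc)" -o tophat_out genome_index reads_1.fq reads_2.fq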
ians is offline   Reply With Quote
Old 07-13-2012, 02:41 AM   #19
MichalO
Member
 
Location: CH

Join Date: Jan 2011
Posts: 10
Default

It seems to be a really random issue... I see it in one sample only; three others ran through happily. I'm curious about the explanation.
MichalO is offline   Reply With Quote
Old 07-13-2012, 06:47 AM   #20
ians
Member
 
Location: St. Louis, MO

Join Date: Aug 2011
Posts: 53
Default

For me it was pretty systematic.

My solution seems to work so far, as I have rerun three samples that previously failed.

I think the problem may stem from Cufflinks' threads competing with other processes at the scheduler. Many times people report this error when they share compute resources on a cluster. Perhaps the ordering check is essentially tripping on a race condition where threads get out of sync and return results in an unexpected order.

Last edited by ians; 07-13-2012 at 06:49 AM.
ians is offline   Reply With Quote