  • rebrendi
    ng
    • May 2008
    • 78

    TopHat ERROR: Segment join failed with err = 1

    Hello all,

    I am running TopHat for the first time, but it gives an error. Could you please have a look at the output?

    Preparing output location ./tophat_out/
    Checking for Bowtie index files
    Checking for reference FASTA file
    Warning: Could not find FASTA file /bowtie-0.12.5/indexes/xxxxa.fa
    Reconstituting reference FASTA file from Bowtie index
    Checking for Bowtie
    Bowtie version: 0.12.x.0
    Checking for Samtools
    Samtools Version: 0.1.16
    Checking reads
    min read length: 36bp, max read length: 36bp
    format: fastq
    quality scale: phred33 (default)
    Mapping reads against xxxx with Bowtie
    Joining segment hits
    [FAILED]
    Error: Segment join failed with err = 1
  • rebrendi
    ng
    • May 2008
    • 78

    #2
    still relevant...


    • biznatch
      Senior Member
      • Nov 2010
      • 124

      #3
      Look in the log files; there may be more detailed errors.


      • rebrendi
        ng
        • May 2008
        • 78

        #4
        Here is what was in the logs:
        # reads processed: 29130490
        # reads with at least one reported alignment: 23830715 (81.81%)
        # reads that failed to align: 4412560 (15.15%)
        # reads with alignments suppressed due to -m: 887215 (3.05%)
        Reported 89574818 alignments to 1 output stream(s)
        long_spanning_reads v1.2.0 (1752)
        --------------------------------------------
        Opening /dev/null for reading
        Opening /dev/null for reading
        Opening /dev/null for reading
        Opening ./tophat_out/tmp/left_kept_reads.bwtout for reading
        Loading reference sequences...
        Loading spliced hits...done
        Loading junctions...done
        Loading deletions...done
        Error: could not get read # 34344441 from stream
        prep_reads v1.2.0 (1752)
        ---------------------------
        13364 out of 29143854 reads have been filtered out
        /usr/local/bin/tophat -r 100 /bowtie-0.12.5/indexes/xxxx data/sequence.txt
        /usr/local/bin/prep_reads --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir ./tophat_out/ --max-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --sam-header ./tophat_out/tmp/stub_header.sam --max-insertion-length 3 --max-deletion-length 3 --inner-dist-mean 100 --inner-dist-std-dev 20 --no-microexon-search --fastq /sequence.txt
        bowtie -q --un ./tophat_out/tmp/left_kept_reads_missing.fq --max /dev/null -v 2 -p 1 -k 40 -m 40 bowtie-0.12.5/indexes/xxxx ./tophat_out/tmp/left_kept_reads.fq | /usr/local/bin/fix_map_ordering --fastq ./tophat_out/tmp/left_kept_reads.fq - > .//usr/local/bin/long_spanni
        /usr/local/bin/long_spanning_reads --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir ./tophat_out/ --max-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --sam-header ./tophat_out/tmp/stub_header.sam --max-insertion-length 3 --max-deletion-length 3 --inner-dist-mean 100 --inner-dist-std-dev 20 --no-microexon-search ./tophat_out/tmp/xxxx.fa ./tophat_out/tmp/left_kept_reads.fq /dev/null /dev/null /dev/null ./tophat_out/tmp/left_kept_reads.bwtout > ./tophat_out/tmp/file2SmeWx
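
A quick arithmetic check on the numbers in these logs may help narrow things down. The sketch below uses only counts quoted above; the variable names are illustrative, and it makes no claim about how TopHat numbers reads internally.

```python
# Counts copied from the logs quoted in this post.
total_reads = 29_143_854       # prep_reads: reads seen in the input
filtered_out = 13_364          # prep_reads: reads filtered out
bowtie_processed = 29_130_490  # bowtie: "# reads processed"
failing_read_id = 34_344_441   # long_spanning_reads: "could not get read # ..."

# Reads that prep_reads kept and passed on to Bowtie.
kept_reads = total_reads - filtered_out

# Do prep_reads and Bowtie agree on how many reads went through?
counts_agree = kept_reads == bowtie_processed

# Is the read number in the error larger than the whole input?
id_exceeds_input = failing_read_id > total_reads

print(f"kept reads: {kept_reads}")
print(f"prep_reads and bowtie agree: {counts_agree}")
print(f"failing read number exceeds input size: {id_exceeds_input}")
```

The counts from prep_reads and Bowtie agree with each other, while the read number in the error is larger than the number of reads in the input. What that implies is not settled in this thread.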


        • rebrendi
          ng
          • May 2008
          • 78

          #5
          ...and it has also created a directory /tmp with several large files. Some of them look similar to BED format, but none of them has a proper file extension, so it seems that the program did not finish and these are really temporary files.

          Any suggestions?


          • biznatch
            Senior Member
            • Nov 2010
            • 124

            #6
            Yes, those are temp files created by TopHat that are usually deleted automatically after TopHat runs successfully. I'm not sure exactly what's causing your problem; there's just the error "Error: could not get read # 34344441 from stream". Are you using an old version of TopHat? It looks like you're using v1.2.0; maybe try the newest version, 1.3.3. I don't know if that will solve the problem, but it's the first thing I would try.


            • rebrendi
              ng
              • May 2008
              • 78

              #7
              I am using the latest version of TopHat, 1.3.3.
              I just downloaded and installed it yesterday.
              I do not know why it lists version 1.2.0.
              My Bowtie version is very outdated, but it seems
              that the problem is not with Bowtie,
              because Bowtie has mapped the reads.
              Could it be a memory problem or something like that?
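
One way to see where the "1.2.0" comes from is to pull the version banners out of the logs. In the output quoted earlier in this thread, the banners carrying v1.2.0 belong to prep_reads and long_spanning_reads, which are helper programs that TopHat launches, while Bowtie reports 0.12.x.0. The sketch below is a minimal illustration; the regular expression and the function name are assumptions made for this example, not part of TopHat.

```python
import re

# Log lines copied from posts #1 and #4 of this thread.
log_lines = [
    "Bowtie version: 0.12.x.0",
    "Samtools Version: 0.1.16",
    "long_spanning_reads v1.2.0 (1752)",
    "prep_reads v1.2.0 (1752)",
]

# Matches banners of the form "<program> v<major>.<minor>.<patch> (<build>)".
banner = re.compile(
    r"^(?P<program>\w+) v(?P<version>\d+\.\d+\.\d+) \((?P<build>\d+)\)$"
)


def banner_versions(lines):
    """Return {program: version} for every line that looks like a version banner."""
    found = {}
    for line in lines:
        match = banner.match(line.strip())
        if match:
            found[match.group("program")] = match.group("version")
    return found


versions = banner_versions(log_lines)
for program, version in versions.items():
    print(f"{program}: {version}")
```

If the helper programs report 1.2.0 while a 1.3.3 download sits elsewhere on the machine, it may be that an older installation is the one actually being run.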


              • biznatch
                Senior Member
                • Nov 2010
                • 124

                #8
                Ah OK, the 1.2.0 must mean Bowtie then. I don't think it's a memory issue: I ran out of memory on a TopHat run and got a -6 error during "Searching for junctions via segment mapping". I think error 1 must be something else, but I'm not sure. How much memory do you have, and what species are you aligning to? You should be able to align to the mouse or human genome with under 4 GB.


                • rebrendi
                  ng
                  • May 2008
                  • 78

                  #9
                  Yeah, I have more than 4 GB of memory.


                  • rebrendi
                    ng
                    • May 2008
                    • 78

                    #10
                    Just in case someone else is getting this error,
                    I found a similar thread. It seems that this issue
                    has not been resolved yet, or the issue was different
                    for different people.

                    In my case, disk space is definitely not the problem.


                    • cedance
                      Senior Member
                      • Feb 2011
                      • 108

                      #11
                      rebrendi, about the TopHat 1.2.0 versus 1.3.3 version difference: are you working on a cluster? If so, the default "tophat" path (installed by your admin) may be linked to version 1.2.0. This is a very silly thing, but it happens more than one might think, as it's often overlooked. If so, you'll have to add your TopHat path before the default $PATH in your .bash_profile file.
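
To check whether this is what is happening, it can help to list every "tophat" executable on the search path in the order the shell would find it. The sketch below is a minimal illustration in Python; the function name find_all_on_path is made up for this example. Note that the commands logged earlier in the thread were run from /usr/local/bin, so a copy installed somewhere else would only be used if its directory came first.

```python
import os


def find_all_on_path(name, path=None):
    """Return every executable called `name` on the search path, in lookup order."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            # Empty entries are skipped here to keep the sketch simple.
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


if __name__ == "__main__":
    # The first match is the one that runs when you type "tophat".
    for position, candidate in enumerate(find_all_on_path("tophat"), start=1):
        marker = "(this one runs)" if position == 1 else "(shadowed)"
        print(position, candidate, marker)
```

If more than one copy shows up, putting the directory of the wanted version at the front of PATH, as suggested above, changes which one is found first.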
