  • TopHat ERROR: Segment join failed with err = 1

    Hello all,

    I am running TopHat for the first time, but but it gives error. Could you please have a look at the output?

    Preparing output location ./tophat_out/
    Checking for Bowtie index files
    Checking for reference FASTA file
    Warning: Could not find FASTA file /bowtie-0.12.5/indexes/xxxxa.fa
    Reconstituting reference FASTA file from Bowtie index
    Checking for Bowtie
    Bowtie version: 0.12.x.0
    Checking for Samtools
    Samtools Version: 0.1.16
    Checking reads
    min read length: 36bp, max read length: 36bp
    format: fastq
    quality scale: phred33 (default)
    Mapping reads against xxxx with Bowtie
    Joining segment hits
    [FAILED]
    Error: Segment join failed with err = 1

  • #2
    still relevant...



    • #3
      Look in the log files; there may be more detailed errors.
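
A minimal sketch of that check, assuming the logs sit in a `logs` directory under the output location (`./tophat_out/logs` here, which is an assumption; adjust it if your version writes them elsewhere). `LOG_DIR` and `log_status` are illustrative names only.

```shell
# Assumed log location; the run in this thread writes to ./tophat_out/
LOG_DIR="${LOG_DIR:-./tophat_out/logs}"

if [ -d "$LOG_DIR" ]; then
    # Print every line mentioning an error or failure, with file name and line number.
    grep -r -i -n -E 'error|fail' "$LOG_DIR" || echo "no error lines found in $LOG_DIR"
    log_status="searched"
else
    echo "log directory $LOG_DIR does not exist"
    log_status="missing"
fi
```

Run it from the directory where TopHat was started, or set `LOG_DIR` first to point at the right place.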



      • #4
        Here is what was in the logs:
        # reads processed: 29130490
        # reads with at least one reported alignment: 23830715 (81.81%)
        # reads that failed to align: 4412560 (15.15%)
        # reads with alignments suppressed due to -m: 887215 (3.05%)
        Reported 89574818 alignments to 1 output stream(s)
        long_spanning_reads v1.2.0 (1752)
        --------------------------------------------
        Opening /dev/null for reading
        Opening /dev/null for reading
        Opening /dev/null for reading
        Opening ./tophat_out/tmp/left_kept_reads.bwtout for reading
        Loading reference sequences...
        Loading spliced hits...done
        Loading junctions...done
        Loading deletions...done
        Error: could not get read # 34344441 from stream
        prep_reads v1.2.0 (1752)
        ---------------------------
        13364 out of 29143854 reads have been filtered out
        /usr/local/bin/tophat -r 100 /bowtie-0.12.5/indexes/xxxx data/sequence.txt /usr/local/bin/prep_reads --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir ./tophat_out/ --max-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --sam-header ./tophat_out/tmp/stub_header.sam --max-insertion-length 3 --max-deletion-length 3 --inner-dist-mean 100 --inner-dist-std-dev 20 --no-microexon-search --fastq /sequence.txt
        bowtie -q --un ./tophat_out/tmp/left_kept_reads_missing.fq --max /dev/null -v 2 -p 1 -k 40 -m 40 bowtie-0.12.5/indexes/xxxx ./tophat_out/tmp/left_kept_reads.fq | /usr/local/bin/fix_map_ordering --fastq ./tophat_out/tmp/left_kept_reads.fq - > .//usr/local/bin/long_spanni/usr/local/bin/long_spanning_reads --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir ./tophat_out/ --max-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --sam-header ./tophat_out/tmp/stub_header.sam --max-insertion-length 3 --max-deletion-length 3 --inner-dist-mean 100 --inner-dist-std-dev 20 --no-microexon-search ./tophat_out/tmp/xxxx.fa ./tophat_out/tmp/left_kept_reads.fq /dev/null /dev/null /dev/null ./tophat_out/tmp/left_kept_reads.bwtout > ./tophat_out/tmp/file2SmeWx



        • #5
          ...and it has also created a directory /tmp with several large files. Some of them look similar to BED format, but none has a proper extension, so it seems that the program did not finish and these are really temporary files.

          Any suggestions?



          • #6
            Yes, those are temp files created by TopHat that are usually deleted automatically after TopHat runs successfully. I'm not sure exactly what's causing your problem; the only clue is the error "Error: could not get read # 34344441 from stream". Are you using an old version of TopHat? It looks like you're using v1.2.0, so maybe try the newest version, 1.3.3. I don't know if that will solve the problem, but it's the first thing I would try.



            • #7
              I am using the latest version of TopHat, 1.3.3.
              I just downloaded and installed it yesterday.
              I do not know why it lists version 1.2.0.
              My Bowtie version is very outdated, but Bowtie
              does not seem to be the problem,
              because it has mapped the reads.
              Could it be a memory problem or something like that?



              • #8
                Ah OK, the 1.2.0 must mean Bowtie then. I don't think it's a memory issue: I ran out of memory on a TopHat run and got a -6 error during "Searching for junctions via segment mapping", so I think error 1 must be something else, but I'm not sure. How much memory do you have, and what species are you aligning to? You should be able to align to the mouse or human genome with under 4 GB.



                • #9
                  Yeah, I have more than 4 GB of memory.



                  • #10
                    Just in case someone else is getting this error:
                    I found a similar thread. It seems that this issue
                    has not been resolved yet, or the cause was different
                    for different people.

                    In my case, disk space is definitely not the problem.



                    • #11
                      rebrendi, about the TopHat 1.2.0 vs. 1.3.3 version difference: are you working on a cluster? If so, the default "tophat" path (installed by your admin) may be linked to version 1.2.0. This is a very silly thing, but it happens more often than one might think, as it's often overlooked. If so, you'll have to add your TopHat path before the default $PATH in your .bash_profile file.
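
A minimal sketch of that check for a bash-like shell. The install directory `$HOME/tophat-1.3.3/bin` is a hypothetical location, not something taken from this thread; substitute wherever your own copy actually lives.

```shell
# Hypothetical install location; replace with the directory holding your tophat binary.
TOPHAT_DIR="$HOME/tophat-1.3.3/bin"

# Show which tophat executable the shell currently resolves, if any.
command -v tophat || echo "tophat not found on PATH"

# Put the new install ahead of the system default for this session.
PATH="$TOPHAT_DIR:$PATH"
export PATH

# Check again; this now resolves inside $TOPHAT_DIR if the binary is really there.
command -v tophat || echo "tophat not found on PATH"
```

If the second lookup points at the new copy, the same `PATH=...` line can go into .bash_profile to make the change permanent.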

