  • Tophat2 "joining segment hits" does not complete

    Tophat2 runs nicely (6 hours) up to the step "joining segment hits". But this step (a single-core process named "long_spanning_reads") has now been running for almost one week.
    Data: 454 sequencing reads up to 500 nucleotides long, probably containing a lot of exon-exon junctions.
    Tophat version 2.0.0
    Bowtie version: 2.0.0.6
    here is the command:
    tophat2 -p 12 -o tophat_out genome 454_data.fastq.gz

    Does anyone have an idea what could cause this? Or does someone know which parameter can be adjusted to reduce the time for this step?

    Thanks
    Last edited by Sniwells; 10-17-2012, 11:20 AM.

  • #2
    Hello Sniwells,

    I'm having the same problem here. I have 12 libraries from 2 runs (454), and Tophat gets stuck at that step. The most annoying thing is that for some libraries Tophat aligned quickly (around 5-10 h), but for 3 of them it took 1 week to complete, and I'm still waiting for the last 4 to complete (more than 12 days).

    I've not found any topics related to this problem in this forum, nor have I received any answers from the authors of Tophat. Since the libraries have very similar numbers of reads and 454 does not seem to be the most popular choice for RNA-seq, my guess is that this has to do with the length of the reads (which in 454 data are much longer than in Illumina data).

    Anybody's got a clue?

    • #3
      Did you use "--no-coverage-search"? If not, it will take a very long time.

      • #4
        Right after your suggestion I started tophat with the following command:
        tophat2 --no-coverage-search -p 12 -o tophat_out genome 454_data.fastq.gz
        But it seems this parameter does not solve the problem, because tophat2 has been stuck at the same point since the day of your post.
        Maybe tophat is not designed for long 454 reads?

        • #5
          At which step?
          When Tophat is writing the segment and junction files, it can take a few days or even a week.

          Originally posted by Sniwells
          Right after your suggestion I started tophat with the following command:
          tophat2 --no-coverage-search -p 12 -o tophat_out genome 454_data.fastq.gz
          But it seems this parameter does not solve the problem, because tophat2 has been stuck at the same point since the day of your post.
          Maybe tophat is not designed for long 454 reads?
          Last edited by HSV-1; 10-22-2012, 05:38 PM.

          • #6
            Originally posted by HSV-1
            At which step?
            The same step:
            "Tophat2 runs nicely (6 hours) up to the step "joining segment hits". But this step (a single-core process named "long_spanning_reads") has now been running for almost one week."

            • #7
              That is sort of normal.
              Be sure your queue tolerates this much run time, or the job will be killed before it finishes.

              Originally posted by Sniwells
              The same step:
              "Tophat2 runs nicely (6 hours) up to the step "joining segment hits". But this step (a single-core process named "long_spanning_reads") has now been running for almost one week."

              • #8
                Originally posted by HSV-1
                That is sort of normal.
                Be sure your queue tolerates this much run time, or the job will be killed before it finishes.
                The process is still running (14 days). Let's see if there will be a happy ending.

                • #9
                  I stopped the process after it had been running for nearly a month.
                  Has anyone run Tophat with long 454 reads successfully?

                  • #10
                    I didn't know your reads were from Roche 454.
                    There is a special protocol for long reads.

                    • #11
                      Hello again!

                      I've finally managed to make Tophat2 work on my problematic 454 reads. What I did was split my original fastq file into several smaller ones and run Tophat separately on each of them. Then I took the sub-file that took the longest to finish, split it into sub-sub-files, and ran Tophat again on each of them.
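
The splitting step described above could be sketched roughly as follows in Python. This is only an illustration, not the script that was actually used: the function names (`open_text`, `read_fastq`, `split_fastq`) and the chunk file naming are made up here, and the code assumes a standard four-lines-per-record FASTQ file, either plain or gzip-compressed.

```python
import gzip
from itertools import islice
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed file in text mode."""
    path = str(path)
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def read_fastq(path):
    """Yield one FASTQ record at a time as a list of four lines.

    Assumption: every record occupies exactly four lines (no wrapped sequences).
    """
    with open_text(path) as handle:
        while True:
            record = list(islice(handle, 4))
            if not record:
                break
            if len(record) != 4:
                raise ValueError("truncated FASTQ record at end of file")
            yield record


def split_fastq(path, n_chunks, out_dir):
    """Distribute records round-robin over n_chunks uncompressed FASTQ files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: chunk_000.fastq, chunk_001.fastq, ...
    out_paths = [out_dir / f"chunk_{i:03d}.fastq" for i in range(n_chunks)]
    handles = [open(p, "wt") for p in out_paths]
    try:
        for index, record in enumerate(read_fastq(path)):
            handles[index % n_chunks].writelines(record)
    finally:
        for handle in handles:
            handle.close()
    return out_paths
```

Each chunk can then be given to tophat2 in place of the original file, with a separate output directory per chunk, and the chunk that takes longest can be split again in the same way.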

                      After several rounds, I came across a single read that, if removed from the original fastq file, makes Tophat run smoothly and fast.
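
Narrowing the input down to one read can also be done as a plain binary search, which is a simpler variant of the repeated splitting described above. The sketch below is hypothetical: `find_offending_record` and the `is_slow` callback are names invented for illustration. `is_slow` stands for "write this subset of reads to a FASTQ file, run Tophat on it, and report whether it exceeded some time limit", and the search assumes that a single read is responsible for the slowdown.

```python
def find_offending_record(records, is_slow):
    """Return the one record that makes is_slow(subset) true, or None.

    records: list of FASTQ records (any objects; they are never inspected).
    is_slow: hypothetical callable that returns True when the aligner hangs
             (for example, exceeds a time limit) on the given subset.
    Assumption: at most one record causes the slowdown.
    """
    if not is_slow(records):
        return None
    while len(records) > 1:
        middle = len(records) // 2
        first_half = records[:middle]
        # If the first half is slow, the culprit is in it;
        # otherwise it must be in the second half.
        records = first_half if is_slow(first_half) else records[middle:]
    return records[0]


# Usage with a stand-in check instead of a real Tophat run:
reads = [f"read{i}" for i in range(1000)]
culprit = find_offending_record(reads, lambda subset: "read617" in subset)
```

With this scheme the number of Tophat runs grows with the logarithm of the number of reads, so even a library of a million reads needs only about twenty runs after the initial check, provided the fast runs really are fast.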

                      I still don't know what makes those reads special as they are not the longest, nor the shortest, nor showing bad quality...

                      Anyway, I hope this works for you too.
                      Pablo
