  • TopHat approximate run time & memory usage?

    Hi everyone,

    I'm using TopHat to map my RNA-seq reads to splice junctions for use in Cufflinks, but it's been taking a bit longer than I expected. One sample I ran (as an initial test) has been going for ~34 hours, and the tophat_out folder is using up around 76 GB of space. By looking in the logs, it seems to be on the "segment_juncs" step.

    Each of the individual samples I hope to align has roughly 90 million reads (@ 50 nt) over 3 lanes, and I'm aligning to the human genome (hg19).

    Would anyone know how long I should expect the program to run for, and also how much disk space it'll need per sample?

    Thanks!

    edit: I'm using single-end reads
    Last edited by xinchen; 05-16-2010, 07:06 PM.

  • #2
    It routinely takes ~40 hours for me (~45 million reads, with -p 4 to use four threads, and 16 GB of RAM). TopHat deletes the huge temp files it generates, so the disk usage may not be much of an issue.


    • #3
      I'm not sure, but I think TopHat needs about 3 days for 75 million reads...


      • #4
        Thanks! It ended up taking ~50 hours for my test sample, which isn't too long.


        • #5
          How can you guys compare run time and memory usage without stating the CPU and RAM you are using???
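
For a rough side-by-side of the figures quoted in this thread, here is a minimal Python sketch that converts each report into millions of reads per wall-clock hour. It is only back-of-envelope arithmetic: the names `reports`, `reads_per_hour` and `summarize` are made up for illustration, "about 3 days" is taken as 72 hours, and the ~50 hour test run is assumed to have had the ~90 million reads quoted for each sample. Only one post states a thread count and RAM, so the rates are not really comparable across machines.

```python
# Figures quoted in the thread: (post, million reads, hours, threads or None if not stated).
reports = [
    ("#2", 45, 40, 4),
    ("#3", 75, 72, None),  # "about 3 days" taken as 72 hours (assumption)
    ("#4", 90, 50, None),  # assumes the test sample had ~90 million reads (assumption)
]


def reads_per_hour(million_reads, hours):
    """Return throughput in millions of reads per wall-clock hour."""
    return million_reads / hours


def summarize(rows):
    """Build one human-readable line per reported run."""
    lines = []
    for post, reads, hours, threads in rows:
        rate = reads_per_hour(reads, hours)
        hardware = f"{threads} threads" if threads else "threads not stated"
        lines.append(f"{post}: {rate:.2f} M reads/hour ({hardware})")
    return lines


for line in summarize(reports):
    print(line)
```

The spread between these rates could come from hardware, thread count, or the data itself, which is why stating CPU, threads and RAM alongside a run time matters.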
