  • Minimal RNA seq dataset for performance eval

    Hello all. I'd like to compare the speed for the tophat/cuffdiff protocol on various hardware platforms to see which is the speediest combination. Is there a small genome/sample set that anyone knows of that would be ideal for testing? The run should be quick (no more than 15 minutes per run) and ideally produce some differential expression data or at least be verified.

    My default approach is going to be to use the genome I already have for my samples and then just chop off a small piece of the left and right paired-end reads (probably 10 MB worth or so). I doubt I'll see anything interesting in the output, but at least it will be a speed test.
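A minimal sketch of the "chop off a small piece" approach described above, assuming standard four-line FASTQ records (no line-wrapped sequences) and mates stored in the same order in both files. The function and file names are hypothetical, chosen only for illustration.

```python
import gzip
from itertools import islice


def open_maybe_gzip(path, mode):
    """Open a plain or gzip-compressed text file based on its extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode + "t")
    return open(path, mode)


def head_paired_fastq(r1_in, r2_in, r1_out, r2_out, n_pairs):
    """Copy the first n_pairs records from both mate files, keeping them in sync.

    Assumes each FASTQ record is exactly 4 lines. Returns the number of
    pairs actually written (fewer than n_pairs if the input runs out).
    """
    written = 0
    with open_maybe_gzip(r1_in, "r") as f1, open_maybe_gzip(r2_in, "r") as f2, \
         open_maybe_gzip(r1_out, "w") as o1, open_maybe_gzip(r2_out, "w") as o2:
        while written < n_pairs:
            rec1 = list(islice(f1, 4))  # one record from the left mates
            rec2 = list(islice(f2, 4))  # the matching record from the right mates
            if len(rec1) < 4 or len(rec2) < 4:
                break  # stop at end of input or on a truncated record
            o1.writelines(rec1)
            o2.writelines(rec2)
            written += 1
    return written
```

Calling something like `head_paired_fastq("left.fq", "right.fq", "left_small.fq", "right_small.fq", 50000)` would write the first 50,000 pairs. Taking reads from the top of the file is not a random sample, which is fine for a speed test but less so if the expression results matter.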

    Ideas are welcome. Sorry if this question is naive, I'm just starting with RNA seq. Any information on speeding up RNA seq would be welcome. I just found STAR, which looks interesting, but I don't know much about it (http://gingeraslab.cshl.edu/STAR/).

  • #2
    You could use the dataset suggested by Cole in their protocol paper: http://www.nature.com/nprot/journal/....2012.016.html. If you downsample it, you may or may not get interesting results from the set.

    STAR is fast and is the aligner of choice for many people. If your genome is large, be ready to have ample memory available for STAR.

    • #3
      Thanks GenoMax. It so happens that's the paper and protocol we started with, so I've considered using the iGenome fly data, but the problem is the genome is still hefty as are the data files. I think they compare with a normal RNA seq data set. The yeast genome is smaller, but I couldn't find the published diff. gene data like they had for the fly.

      I was just in touch with the author of STAR and will be trying it this week. We have two platforms: one has 20 GB/4 cores and the other has 24 GB/8 cores. If we can run tophat, I'm assured STAR will be much faster. It should be educational to compare the data that's generated.
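For the speed comparison across the two platforms, a small wall-clock timing wrapper might look like the sketch below. It is only an illustration: the helper names are hypothetical, and the command passed in is a stand-in to be replaced with whatever tophat, cuffdiff, or STAR command line is actually being benchmarked.

```python
import shlex
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Run cmd (a string or an argument list) several times.

    Returns a list of wall-clock durations in seconds, one per run.
    Raises subprocess.CalledProcessError if any run exits non-zero.
    """
    args = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(args, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce a list of durations to the run count, best time, and mean time."""
    return {
        "runs": len(timings),
        "best": min(timings),
        "mean": sum(timings) / len(timings),
    }


if __name__ == "__main__":
    # Stand-in command; substitute the real aligner invocation here.
    print(summarize(time_command([sys.executable, "-c", "pass"], repeats=2)))
```

Running the same script with the same input files on each machine gives directly comparable numbers. Repeating each run a few times helps separate disk-cache effects on the first run from the steady-state speed.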
