  • How many reads/replicates do I need for bacterial RNA-seq? MiSeq?

    I'm aware of the ENCODE best practices and other recent research that give guidelines about the number of reads you need for RNA-seq in mammalian genomes. I generally recommend ~40-50M/sample for most applications, as low as 20M if the goal is just expression at the gene level, 100M+ if the goal is rare/aberrant isoform identification. I'm taking on a bacterial RNA-seq project where the goal is to identify differentially expressed genes and isoforms in a WT vs mutant strain of pathogenic F. tularensis. While there isn't splicing, I'm coming to appreciate that the prokaryotic transcriptome is still complex - overlapping genes, strand specificity, sRNAs, etc.

    1. How many reads do I need? What length? Is paired end sequencing as necessary as with complex (spliced) mammalian genomes? A recent paper gave guidelines for another bacterium, P. syringae. With 3.5 million prefiltered reads they were able to cover 95% of the annotated genes with at least 10 reads (average 190). P. syringae has a larger genome and about 3 times as many annotated ORFs as our bacterium, F. tularensis. So can I get away with fewer reads, say, 2 million before any filtering?

    2. If I'm about right on #1 above, needing ~2M reads/sample, and I want to sequence, say, 2-4 samples from each condition (WT vs Mut), what's my best choice for platform? Will MiSeq have the capacity to do this on a single flowcell, or should I use a single lane on our GAIIx?

    3. What counts as a biological replicate in this case? I would imagine taking aliquots from the same flask would be more like technical replication, and taking two different flasks grown from two different colonies to be biological replicates. Am I thinking about this correctly?
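    The scaling argument in question 1 can be written out as a quick back-of-envelope calculation. This is only a sketch under loud assumptions: that "average 190" means mean reads per annotated gene, that reads spread across genes in proportion to depth, and that gene lengths and expression distributions are comparable between the two species. The function names are made up for illustration.

```python
def scaled_mean_reads_per_gene(ref_mean, ref_reads, new_reads, gene_ratio):
    """Scale a reference mean reads-per-gene to a new experiment.

    gene_ratio is (reference gene count / new gene count). Assumes coverage
    scales linearly with depth and inversely with the number of genes.
    """
    return ref_mean * (new_reads / ref_reads) * gene_ratio


def reads_to_match_reference(ref_reads, gene_ratio):
    """Reads needed to reach the same mean reads per gene as the reference."""
    return ref_reads / gene_ratio


# Numbers quoted in the post: 3.5M reads, mean 190 reads/gene in P. syringae,
# which has about 3x as many annotated ORFs as F. tularensis.
mean_ft = scaled_mean_reads_per_gene(190, 3.5e6, 2.0e6, 3.0)
reads_needed = reads_to_match_reference(3.5e6, 3.0)

print(round(mean_ft))       # 326 reads per gene at 2M reads
print(round(reads_needed))  # 1166667 reads to match a mean of 190
```

    Under those assumptions 2M reads would give a higher mean than the P. syringae data, but a mean says nothing about the low-expression tail or about how many reads are lost to rRNA.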

  • #2
    Hi Turner,

    Regarding the replicates, you are thinking of that correctly. The important part of replication is to replicate around your largest source of experimental variation, which is usually (not always) biological. As for running 2-4 samples per condition, I would change that to 3-5. Two, imo, is never an option and really is no better than one.

    For read length and paired vs single, there are a few publications out there now that state that short single-end reads are sufficient. The RSEM paper describes this as well. We did a little study where we took 101 PE data from mouse and created in silico a series of data sets ranging from 36 cycle SE and 36 cycle PE up to the full data set, including partial read subsets to explore multiplexing possibilities. We looked at our sensitivity to splice variants and detection of known transcript d/dx. What we found was that the optimum was somewhere between 50 and 76 cycle SE, which includes a little personal bias towards longer reads. The multiplexing question is a bit more ambiguous so we really don't (yet? not sure) have a good handle on that. What we have been telling people is that if you have to choose between longer reads and more reads, choose more.

    On MiSeq vs GA: on the MiSeq you will be doing 2-3 samples at a time for 2-3M reads per replicate, while on the GA, if Yongde has a good run, you should be able to do all 6 (thinking triplicates) in one go and get 2-3M+ per replicate. Tell your core you want >30M reads.
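    The multiplexing arithmetic here is simple enough to sketch. The helper names are hypothetical, the split across barcodes is assumed to be perfectly even (it never is in practice), and the 6M MiSeq yield is only a placeholder consistent with "2-3 at a time for 2-3M reads", not a spec.

```python
def reads_per_sample(run_yield, n_samples):
    """Reads each sample gets if a run or lane is split evenly across samples."""
    return run_yield // n_samples


def samples_per_run(run_yield, target_reads_per_sample):
    """How many samples fit in one run at a given minimum depth per sample."""
    return run_yield // target_reads_per_sample


# Asking the core for 30M reads on one GA lane, split across 6 libraries
# (triplicates of WT and mutant):
ga_depth = reads_per_sample(30_000_000, 6)

# Placeholder MiSeq yield of 6M reads with a 2M reads/sample target:
miseq_samples = samples_per_run(6_000_000, 2_000_000)

print(ga_depth)       # 5000000
print(miseq_samples)  # 3
```

    With an uneven pool the worst-covered library is what matters, so it is worth leaving some headroom over the target depth.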

    Good luck.
    GO CAVS!
    Last edited by bioBob; 03-01-2012, 05:20 AM.

    • #3
      In your 2 million reads you have to take into account whether the original RNA has been rRNA depleted or not. If your libraries are from total RNA and there was no ribosomal RNA depletion you will not get sufficient mRNA coverage in 2 million reads.
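      To put a number on that, here is a minimal sketch of how the rRNA fraction eats into the read budget. The rRNA fractions below are illustrative assumptions, not measurements from any particular library; plug in whatever your own pilot data or kit validation shows.

```python
def mrna_reads(total_reads, rrna_fraction):
    """Reads left for mRNA after discarding the fraction that maps to rRNA."""
    if not 0.0 <= rrna_fraction <= 1.0:
        raise ValueError("rrna_fraction must be between 0 and 1")
    return int(round(total_reads * (1.0 - rrna_fraction)))


total = 2_000_000

# Assumed 90% rRNA for an undepleted total-RNA library (illustrative only).
undepleted = mrna_reads(total, 0.90)

# Assumed 30% residual rRNA after depletion (illustrative only).
depleted = mrna_reads(total, 0.30)

print(undepleted)  # 200000
print(depleted)    # 1400000
```

      The same 2 million reads can therefore differ several-fold in useful depth depending on how well the depletion worked.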

      I agree PE is not required; for all our bacterial libraries we find 42 cycles to be sufficient.

      With regard to the biorep question, you are correct: sampling from the same flask constitutes a technical replicate, not a biological one.

      Best of luck.

      • #4
        You might get something out of this paper, and its supplemental:

        Fermenting microbial communities generate hydrogen; its removal through the production of acetate, methane, or hydrogen sulfide modulates the efficiency of energy extraction from available nutrients in many ecosystems. We noted that pathway components for acetogenesis are more abundantly and consist …


        They did a lot of trial and error to find the best way to do bacterial RNA-Seq on in vivo populations. Interestingly, they did a rarefaction analysis and found that above 300,000 reads aligning to mRNA, not much more information is gained.
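        For anyone who wants to try a rarefaction analysis on their own pilot data, the idea can be sketched as below. This is not the paper's pipeline, just a generic version: subsample the mRNA-aligned reads to several depths and count how many genes are still detected. The toy counts are made up, and `rarefaction_curve` is a hypothetical helper name.

```python
import random
from collections import Counter


def rarefaction_curve(gene_counts, depths, min_reads=1, seed=0):
    """For each depth, subsample that many reads without replacement and
    count the genes that still have at least min_reads reads."""
    rng = random.Random(seed)
    # One list entry per aligned read, labelled with its gene.
    reads = [gene for gene, n in gene_counts.items() for _ in range(n)]
    curve = []
    for depth in depths:
        if depth > len(reads):
            raise ValueError("depth exceeds the number of available reads")
        per_gene = Counter(rng.sample(reads, depth))
        detected = sum(1 for c in per_gene.values() if c >= min_reads)
        curve.append((depth, detected))
    return curve


# Toy, made-up counts: 200 genes with a long tail of low expression.
toy_counts = {f"gene{i}": max(1, 2000 // (i + 1)) for i in range(200)}
total = sum(toy_counts.values())
depths = [total // 100, total // 10, total // 2, total]

curve = rarefaction_curve(toy_counts, depths)
for depth, detected in curve:
    print(depth, detected)
```

        Where the detected-gene count stops climbing is the depth beyond which extra reads add little, which is the kind of plateau the paper reports.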

        • #5
          In 2012 there was a quite robust study to determine the number of reads needed for transcriptome analysis and differential gene expression studies with RNA-Seq in bacteria:
          http://www.ncbi.nlm.nih.gov/pubmed?t...transcriptomes[all]&cmd=correctspelling

          HTH

          • #6
            Resource for Bacterial RNA-seq depth recommendations

            This paper "How deep is deep enough for RNA-Seq profiling of bacterial transcriptomes?" (https://bmcgenomics.biomedcentral.co...71-2164-13-734) does a nice job of going through the key library prep and sequencing depth parameters for a bacterial RNA-seq experiment, and concludes that "5-10 million reads per sample [...] are sufficient for most applications of bacterial RNA-Seq".

            From my experience, the critical experimental design questions for a bacterial RNA-seq experiment are how well the rRNAs will be depleted (i.e., how many reads will not be covering target genes) and how many replicates can be performed (i.e., what the statistical power will be). Depth is helpful to overcome poor rRNA depletion, but adding more replicates is the better use of increased sequencing cost, in my opinion.
            Last edited by dmking; 02-14-2020, 01:42 PM. Reason: Added additional notes for depth vs replicates
