  • shotgun metagenomic sequencing coverage

    My lab has been working on bacterial 16S-based microbiome analysis of mouse gut and fecal samples. Recently I've been thinking of moving to Illumina shotgun metagenomic sequencing if we can afford it, so I'd like to collect some information to see whether it is affordable for my lab.

    Assuming there are 20 dominant bacterial species to be assembled,
    we will need 20 species * 100x (depth for de novo assembly) * 4 Mb (estimated bacterial genome size) = ~8 Gb per mouse sample?

    Is the above estimate OK?
    Are there any suggestions on the ideal sequencing depth and coverage needed for each mouse sample (for fecal or gut bacteria)?

    How many mice per condition is recommended?

    Could we use the same kit to isolate the fecal bacterial DNA?

    Are there any other concerns about switching from 16S sequencing to shotgun metagenomic sequencing?
    (I am also worried about the bioinformatics side of shotgun metagenomic sequencing.)


    Thanks.
    Last edited by neokao; 06-08-2015, 07:37 PM.

  • #2
    Metagenomes often have an exponential coverage distribution, which makes it difficult to determine how much depth you need (other than "more"). Your calculation is fine if the species distribution is perfectly even, but it won't be. Of course, whether you do shotgun or 16S depends on what you are trying to accomplish, as does the number of replicates. Perhaps you could elaborate on your goal? If you just want to quantify abundances for various conditions, there's no reason to do de novo assemblies each time; just assemble once and use mapping to calculate abundance subsequently. It's possible that mouse gut bacteria already have good assemblies, in which case you wouldn't need to assemble at all unless you notice a large number of reads not mapping to known gut bacteria.

    For assembly, I'd suggest starting with a HiSeq lane (~100Gbp) at minimum. For mapping, you could use a tiny fraction of that.
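
    To illustrate why an uneven community breaks the 20 * 100x * 4Mb estimate, here is a rough sketch. It assumes reads are sampled in proportion to each species' share of the DNA and that all genomes are the same size; the halving abundance profile is purely hypothetical, and the function name is made up for this example.

```python
def total_bases_for_target_depth(rel_abundances, genome_size_bp, target_depth):
    """Total bases to sequence so that the rarest species reaches target_depth.

    Assumes reads are drawn in proportion to relative abundance and that
    every species has the same genome size (both are simplifications).
    """
    total = sum(rel_abundances)
    fractions = [a / total for a in rel_abundances]
    rarest = min(fractions)
    # The rarest species receives rarest * total_bases; solve for total_bases.
    return target_depth * genome_size_bp / rarest


n_species = 20
genome_size = 4_000_000  # 4 Mb, as in the original estimate
depth = 100              # 100x for de novo assembly

# Perfectly even community: reproduces the ~8 Gb estimate.
even = [1.0] * n_species
# Hypothetical uneven community: each species half as abundant as the previous one.
uneven = [0.5 ** i for i in range(n_species)]

even_bases = total_bases_for_target_depth(even, genome_size, depth)
uneven_bases = total_bases_for_target_depth(uneven, genome_size, depth)

print(f"even community:   {even_bases / 1e9:.1f} Gbp")
print(f"uneven community: {uneven_bases / 1e9:.1f} Gbp")
```

    With the even profile this gives back roughly 8 Gbp; with the made-up halving profile the requirement for the rarest species grows by several orders of magnitude, which is why "more" is the only safe answer before you have seen any data.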



    • #3
      Originally posted by neokao
      My lab has been working on bacterial 16S-based microbiome analysis in mouse gut and fecal samples. Recently I've been thinking to move on to Illumina shotgun metagenomic sequencing if we could afford. So I'd like to collect some information to see if it is affordable to my lab.

      Assuming there are 20 dominant bacterial species to be assembled,
      we will need 20 * 100x (depth for de novo assembling) * 4Mb (estimated size of bacterial genome) = ~8Gb per mouse sample ?

      Is the above estimate OK?
      I guess one should also allow some reads for DNA from mouse and also gut contents.



      • #4
        Thanks for the suggestions. I agree, and my goal is quite simple:
        I have a knock-out mouse with a special gut microbiota-associated phenotype, so I'd like to know how the fecal microbiota (bacterial species and their abundance) differ between wild-type mice and my knock-out mice.
        If there is no need for de novo assembly, how would you suggest I start (sequencing coverage, depth, and number of mice)? The host (mouse) DNA content will be excluded, I think.


        Originally posted by Brian Bushnell
        Metagenomes often have an exponential coverage distribution which makes it difficult to determine how much depth you need (other than "more"). Your calculation is fine if the species distribution is perfectly even, but it won't be. Of course, whether you do shotgun or 16s depends on what you are trying to accomplish... as does the number of replicates. Perhaps you could elaborate on your goal? If you just want to quantify abundances for various conditions, there's no reason to do denovo assemblies each time; just assemble once and use mapping to calculate abundance subsequently. It's possible that mouse gut bacteria already have good assemblies, in which case you wouldn't need to assemble at all unless you notice a large amount of reads not mapping to known gut bacteria.

        For assembly, I'd suggest starting with a HiSeq lane (~100Gbp) at minimum. For mapping, you could use a tiny fraction of that.



        • #5
          You will need to de novo assemble the data unless you already know (or think you know) which bacteria you will be sequencing and their genomes have already been assembled.

          As for excluding the host DNA, that's going to be done by mapping; you still need to account for extra coverage that will be "wasted" on mouse sequence, because no matter what the protocol is, there will be mouse DNA mixed in with the bacterial DNA, and it will get sequenced.
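
          As a rough sketch of that adjustment: if some fraction of the reads turns out to be mouse, the total has to be scaled up so that enough bacterial bases remain. The 10% host fraction below is a placeholder, not a measured value, and the function name is made up for this example.

```python
def total_bases_with_host(bacterial_bases_needed, host_fraction):
    """Scale a bacterial-only sequencing target to allow for host reads.

    host_fraction is the share of all sequenced bases expected to come from
    the host (mouse); it must be in [0, 1).
    """
    if not 0.0 <= host_fraction < 1.0:
        raise ValueError("host_fraction must be in [0, 1)")
    # Only (1 - host_fraction) of the output is bacterial.
    return bacterial_bases_needed / (1.0 - host_fraction)


# Hypothetical example: 8 Gbp of bacterial sequence wanted, 10% host reads assumed.
needed = total_bases_with_host(8e9, 0.10)
print(f"sequence about {needed / 1e9:.2f} Gbp in total")
```

          The real host fraction depends on the sample type and extraction, so it is best estimated from a pilot run by mapping reads to the mouse genome.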

          If you don't need to de novo assemble, I would aim for about 0.1x depth on the lowest-abundance species that you want to quantify. But you can't really know how much you'll need ahead of time, since you don't know the abundances. If you plan to assemble, just sequence a lane (~300M read pairs) and then decide whether you need more; if not, you will probably be able to do useful quantification, at least of the most abundant species, with ~40M single-end 100bp reads. For quantifying species abundance by mapping without assembly, single-end 100bp reads are fine and will be (in some ways) only 1/3rd the price of the 2x150bp paired reads that would be much better for assembly.
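
          The arithmetic behind those numbers can be sketched as follows. This assumes a 4 Mb genome (the figure from the original question), reads sampled in proportion to relative abundance, and no host or unmapped reads; the function names are made up for this example.

```python
import math


def reads_needed(target_depth, genome_size_bp, read_length_bp, rel_abundance):
    """Reads required for a species at rel_abundance to reach target_depth."""
    bases_on_species = target_depth * genome_size_bp
    total_bases = bases_on_species / rel_abundance
    return math.ceil(total_bases / read_length_bp)


def min_quantifiable_abundance(n_reads, read_length_bp, genome_size_bp, target_depth):
    """Lowest relative abundance that still reaches target_depth with n_reads."""
    return target_depth * genome_size_bp / (n_reads * read_length_bp)


genome_size = 4_000_000  # assumed 4 Mb bacterial genome

# With 40M single-end 100bp reads, how rare can a species be at 0.1x depth?
floor = min_quantifiable_abundance(40_000_000, 100, genome_size, 0.1)
print(f"lowest abundance reaching 0.1x: {floor:.4%}")

# Bases per fragment: single-end 100bp versus 2x150bp paired reads.
bases_ratio = 100 / (2 * 150)
print(f"SE100 yields {bases_ratio:.2f} of the bases per fragment of 2x150")
```

          Under these assumptions, 40M single-end 100bp reads reach 0.1x on species down to about 0.01% relative abundance. The bases-per-fragment ratio is one way to read the "1/3rd the price" remark, although actual pricing depends on the sequencing provider.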



          • #6
            Hi, for metagenomic, this site might help. But I'm not sure.
