SEQanswers

SEQanswers (http://seqanswers.com/forums/index.php)
-   Metagenomics (http://seqanswers.com/forums/forumdisplay.php?f=29)
-   -   shotgun metagenomic sequencing coverage (http://seqanswers.com/forums/showthread.php?t=59565)

neokao 06-08-2015 07:28 PM

shotgun metagenomic sequencing coverage
 
My lab has been working on bacterial 16S-based microbiome analysis in mouse gut and fecal samples. Recently I've been thinking about moving on to Illumina shotgun metagenomic sequencing if we can afford it, so I'd like to collect some information to see whether it is affordable for my lab.

Assuming there are 20 dominant bacterial species to be assembled,
we will need 20 * 100x (depth for de novo assembly) * 4 Mb (estimated size of a bacterial genome) = ~8 Gb per mouse sample?

Is the above estimate OK?
Are there any suggestions on the ideal NGS depth and coverage needed for each mouse sample (fecal or gut bacteria)?

How many mice per condition are recommended?

Could we use the same kit to isolate the fecal bacterial DNA?

Any other concerns about switching from 16S NGS to shotgun metagenomic sequencing?
(I am also worried about the bioinformatics part of shotgun metagenomic sequencing.)


Thanks.

Brian Bushnell 06-08-2015 08:24 PM

Metagenomes often have an exponential coverage distribution, which makes it difficult to determine how much depth you need (other than "more"). Your calculation is fine if the species distribution is perfectly even, but it won't be. Of course, whether you do shotgun or 16S depends on what you are trying to accomplish... as does the number of replicates. Perhaps you could elaborate on your goal? If you just want to quantify abundances for various conditions, there's no reason to do de novo assemblies each time; just assemble once and use mapping to calculate abundance subsequently. It's possible that mouse gut bacteria already have good assemblies, in which case you wouldn't need to assemble at all unless you notice a large number of reads not mapping to known gut bacteria.

For assembly, I'd suggest starting with a HiSeq lane (~100Gbp) at minimum. For mapping, you could use a tiny fraction of that.
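
To make the point about uneven abundance concrete, here is a minimal Python sketch. It assumes every species has the same 4 Mb genome (the figure from the original post), so a species' share of the reads equals its relative abundance, and it models the skew with a geometric series whose ratio (0.7) is an arbitrary illustrative value, not a measured one. The function name and parameters are made up for this example.

```python
def total_bases_needed(fractions, target_depth, genome_size):
    """Bases to sequence so that the rarest species reaches target_depth.

    fractions: share of sequenced bases contributed by each species
               (assumed equal to relative abundance; genome sizes are equal).
    """
    rarest = min(fractions)
    return target_depth * genome_size / rarest


n_species = 20        # from the original post
genome_size = 4e6     # 4 Mb per genome, from the original post
target_depth = 100    # 100x for de novo assembly, from the original post

# Perfectly even community: every species contributes 1/20 of the reads.
even = [1 / n_species] * n_species

# Skewed community: each species is 0.7x as abundant as the previous one.
# The ratio is a hypothetical value chosen only for illustration.
ratio = 0.7
weights = [ratio ** i for i in range(n_species)]
skewed = [w / sum(weights) for w in weights]

even_gbp = total_bases_needed(even, target_depth, genome_size) / 1e9
skewed_gbp = total_bases_needed(skewed, target_depth, genome_size) / 1e9

print(f"even community:   {even_gbp:.1f} Gbp")
print(f"skewed community: {skewed_gbp:.1f} Gbp")
```

With an even community this reproduces the ~8 Gb estimate. With the skewed community the requirement for the rarest species grows by more than two orders of magnitude, which is why the honest answer tends to be "more".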

nucacidhunter 06-09-2015 03:11 AM

Quote:

Originally Posted by neokao (Post 174479)
My lab has been working on bacterial 16S-based microbiome analysis in mouse gut and fecal samples. Recently I've been thinking about moving on to Illumina shotgun metagenomic sequencing if we can afford it, so I'd like to collect some information to see whether it is affordable for my lab.

Assuming there are 20 dominant bacterial species to be assembled,
we will need 20 * 100x (depth for de novo assembly) * 4 Mb (estimated size of a bacterial genome) = ~8 Gb per mouse sample?

Is the above estimate OK?

I guess one should also budget some reads for DNA from the mouse itself and from other gut contents.
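
As a rough sketch of that adjustment in Python: if some fraction of the reads is lost to mouse and other non-bacterial DNA, the total to sequence is the bacterial target divided by the remaining fraction. The fractions below are hypothetical placeholders, since the real value depends on the sample and extraction protocol, and the function name is invented for this example.

```python
def bases_to_sequence(target_bacterial_bases, non_bacterial_fraction):
    """Total bases to sequence so that target_bacterial_bases remain
    after discarding the non-bacterial share (host, gut contents)."""
    if not 0 <= non_bacterial_fraction < 1:
        raise ValueError("non_bacterial_fraction must be in [0, 1)")
    return target_bacterial_bases / (1 - non_bacterial_fraction)


target = 8e9  # the ~8 Gb bacterial estimate from the quoted post

# Hypothetical non-bacterial fractions, for illustration only.
for fraction in (0.05, 0.2, 0.5):
    total_gbp = bases_to_sequence(target, fraction) / 1e9
    print(f"{fraction:.0%} non-bacterial -> sequence {total_gbp:.1f} Gbp")
```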

neokao 06-16-2015 06:56 AM

Thanks for the suggestions. I agree, and my goal is quite simple:
I have a knock-out mouse with a particular gut microbiota-associated phenotype, so I'd like to know how the fecal microbiota (bacterial species and abundance) differs between wild-type mice and my knock-out mice.
If there is no need for de novo assembly, how would you suggest I start (NGS coverage, depth, and number of mice)? The host (mouse) DNA content will be excluded, I think.


Quote:

Originally Posted by Brian Bushnell (Post 174488)
Metagenomes often have an exponential coverage distribution, which makes it difficult to determine how much depth you need (other than "more"). Your calculation is fine if the species distribution is perfectly even, but it won't be. Of course, whether you do shotgun or 16S depends on what you are trying to accomplish... as does the number of replicates. Perhaps you could elaborate on your goal? If you just want to quantify abundances for various conditions, there's no reason to do de novo assemblies each time; just assemble once and use mapping to calculate abundance subsequently. It's possible that mouse gut bacteria already have good assemblies, in which case you wouldn't need to assemble at all unless you notice a large number of reads not mapping to known gut bacteria.

For assembly, I'd suggest starting with a HiSeq lane (~100Gbp) at minimum. For mapping, you could use a tiny fraction of that.


Brian Bushnell 06-16-2015 01:54 PM

You will need to de novo assemble the data unless you already know (or think you know) which bacteria you will be sequencing, and their genomes have already been assembled.

As for excluding the host DNA, that's going to be done by mapping; you still need to account for extra coverage that will be "wasted" on mouse sequence, because no matter what the protocol is, there will be mouse DNA mixed in with the bacterial DNA, and it will get sequenced.

If you don't need to de novo assemble, I would aim for about 0.1x depth on the lowest-abundance species that you want to quantify. But you can't really know how much you'll need for that ahead of time, since you don't know the abundance :) If you plan to assemble, just sequence a lane (~300M read pairs) and then decide whether you need more; if not, you will probably be able to do useful quantification, at least of the most abundant species, with ~40M single-end 100bp reads. For quantification of species abundance by mapping without assembly, single-end 100bp reads are fine and will be (in some ways) only 1/3rd the price of the 2x150bp paired reads that would be much better for assembly.
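
The numbers above can be turned into a quick back-of-the-envelope check. The Python sketch below assumes a 4 Mb genome (the figure from the first post), treats "abundance" as a species' share of the bacterial bases, and ignores mapping losses; the function names and the optional host_fraction parameter are hypothetical, introduced only for this illustration.

```python
def bacterial_bases(n_reads, read_length, host_fraction=0.0):
    """Bases left for bacteria after removing the host share."""
    return n_reads * read_length * (1 - host_fraction)


def species_depth(n_reads, read_length, abundance, genome_size,
                  host_fraction=0.0):
    """Expected depth on one species given its share of bacterial bases."""
    usable = bacterial_bases(n_reads, read_length, host_fraction)
    return usable * abundance / genome_size


def min_quantifiable_abundance(n_reads, read_length, genome_size,
                               min_depth=0.1, host_fraction=0.0):
    """Smallest abundance that still reaches min_depth (0.1x as suggested)."""
    usable = bacterial_bases(n_reads, read_length, host_fraction)
    return min_depth * genome_size / usable


genome_size = 4e6  # 4 Mb, assumed from the first post

# One lane of 2x150bp pairs: 300M pairs * 2 reads * 150 bp.
lane_gbp = 300_000_000 * 2 * 150 / 1e9
print(f"one lane of 2x150bp: {lane_gbp:.0f} Gbp")

# 40M single-end 100bp reads, no host DNA assumed (optimistic).
floor = min_quantifiable_abundance(40_000_000, 100, genome_size)
print(f"lowest abundance reaching 0.1x: {floor:.4%}")
```

Under these assumptions, 40M single-end 100bp reads give 4 Gbp, so a species would need to make up roughly 0.01% of the bacterial bases to reach 0.1x. Any host DNA in the sample raises that floor accordingly.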


All times are GMT -8.