Hi All,
I'd like to get more involved with metagenomic research, so I've been using QIIME to analyze some 16S data. While reading the literature, I came up with a question about some of the quantitative surveys of bacterial diversity.
A paper I read recently describes some of the disadvantages of using 16S profiling to estimate community composition (http://www.ncbi.nlm.nih.gov/pubmed/22355318). One of the disadvantages they mention is that most bacterial genomes contain multiple copies of the 16S gene, and the number of copies can range from 1 to 15. As an example, E. coli K-12 has 7 copies of the 16S gene.
In metagenomic sequencing surveys, any or all of the copies of the 16S gene in each genome may be sequenced. Since different species have different 16S copy numbers, how do you know the species/genus counts aren't inflated by double-, triple-, or 15X-counting the 16S gene for a particular group of species? Can studies be quantitative without normalizing for the 16S copy number in particular clades (e.g. if Firmicutes have on average a greater 16S copy number than Bacteroidetes, you would expect more Firmicutes counts for the same population numbers)? Although I'm thinking about two papers off the top of my head (here and here), I think it applies to many studies.
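To make the concern concrete, here is a minimal sketch of the kind of correction I have in mind. The taxon names, read counts, and copy numbers are all made up for illustration, and `normalize_by_copy_number` is just a hypothetical helper I wrote for this post (I'm not claiming QIIME does this). It also assumes the copy number of each taxon is known and that every copy is amplified and sequenced equally well, which may not hold in practice.

```python
def normalize_by_copy_number(read_counts, copy_numbers):
    """Divide each taxon's 16S read count by its assumed 16S copy number,
    then rescale so the corrected abundances sum to 1."""
    corrected = {
        taxon: read_counts[taxon] / copy_numbers[taxon]
        for taxon in read_counts
    }
    total = sum(corrected.values())
    return {taxon: value / total for taxon, value in corrected.items()}


# Hypothetical example: two taxa present in equal cell numbers,
# but taxon_A carries 7 copies of the 16S gene and taxon_B carries 1.
read_counts = {"taxon_A": 700, "taxon_B": 100}   # made-up read counts
copy_numbers = {"taxon_A": 7, "taxon_B": 1}      # assumed copy numbers

# Uncorrected relative abundance, straight from the read counts
raw_total = sum(read_counts.values())
raw_fractions = {t: c / raw_total for t, c in read_counts.items()}

# Relative abundance after dividing by copy number
corrected_fractions = normalize_by_copy_number(read_counts, copy_numbers)

print("raw:", raw_fractions)
print("corrected:", corrected_fractions)
```

With these made-up numbers, the raw counts suggest taxon_A makes up 87.5% of the community, while the copy-number-corrected estimate puts the two taxa at 50% each.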
Thanks