SEQanswers

Similar Threads
Thread Thread Starter Forum Replies Last Post
PubMed: A framework for analysis of metagenomic sequencing data. Newsbot! Literature Watch 0 12-02-2010 02:00 AM
PubMed: Identifying and removing artificial replicates from 454 pyrosequencing data. Newsbot! Literature Watch 0 04-03-2010 02:02 AM
PubMed: Wrinkles in the rare biosphere: pyrosequencing errors lead to artificial infl Newsbot! Literature Watch 0 09-04-2009 02:01 AM
PubMed: Orphelia: predicting genes in metagenomic sequencing reads. Newsbot! Literature Watch 0 05-12-2009 05:00 AM
PubMed: Metagenomic Pyrosequencing and Microbial Identification. Newsbot! Literature Watch 0 03-07-2009 05:20 AM

04-15-2010, 02:00 AM   #1
Newsbot!
RSS Posting Maniac
 

Join Date: Feb 2008
Posts: 1,443
PubMed: Artificial and natural duplicates in pyrosequencing reads of metagenomic data

Syndicated from PubMed RSS Feeds

Artificial and natural duplicates in pyrosequencing reads of metagenomic data.

BMC Bioinformatics. 2010 Apr 13;11(1):187

Authors: Niu B, Fu L, Sun S, Li W


ABSTRACT:

BACKGROUND: Artificial duplicates from pyrosequencing reads may lead to incorrect interpretation of the abundance of species and genes in metagenomic studies. Duplicated reads were filtered out in many metagenomic projects. However, since the duplicated reads observed in a pyrosequencing run also include natural (non-artificial) duplicates, simply removing all duplicates may also cause underestimation of abundance associated with natural duplicates.

RESULTS: We implemented a method for identification of exact and nearly identical duplicates from pyrosequencing reads. This method performs an all-against-all sequence comparison and clusters the duplicates into groups using an algorithm modified from our previous sequence clustering method cd-hit. This method can process a typical dataset in ~10 minutes; it also provides a consensus sequence for each group of duplicates. We applied this method to the underlying raw reads of 39 genomic projects and 10 metagenomic projects that utilized pyrosequencing technique. We compared the occurrences of the duplicates identified by our method and the natural duplicates made by independent simulations. We observed that the duplicates, including both artificial and natural duplicates, make up 4-44% of reads. The number of natural duplicates highly correlates with the samples' read density (number of reads divided by genome size). For high-complexity metagenomic samples lacking dominant species, natural duplicates only make up
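
To make the clustering idea in the abstract concrete, here is a minimal Python sketch of greedy duplicate grouping. It is not the authors' implementation and not cd-hit itself: the function names, the `max_mismatches` parameter, and the rule that a duplicate must match the cluster's longest read from the first base onward are all assumptions chosen for illustration. The sketch compares each read only against cluster representatives rather than running a full all-against-all comparison, and it does not build a consensus sequence. `read_density` follows the definition given in the abstract (number of reads divided by genome size).

```python
from typing import List


def mismatches(a: str, b: str) -> int:
    """Count mismatching positions over the shared prefix length of a and b."""
    return sum(1 for x, y in zip(a, b) if x != y)


def cluster_duplicates(reads: List[str], max_mismatches: int = 0) -> List[List[str]]:
    """Greedy clustering (illustrative assumption, not the published algorithm).

    Reads are processed longest first. Each read joins the first cluster whose
    representative (the cluster's first, longest read) it matches from the
    first base with at most `max_mismatches` differences; otherwise it starts
    a new cluster.
    """
    clusters: List[List[str]] = []
    for read in sorted(reads, key=len, reverse=True):
        for cluster in clusters:
            representative = cluster[0]
            if mismatches(representative, read) <= max_mismatches:
                cluster.append(read)
                break
        else:
            # No existing representative matched, so this read seeds a cluster.
            clusters.append([read])
    return clusters


def duplicate_fraction(reads: List[str], max_mismatches: int = 0) -> float:
    """Fraction of reads that are duplicates of some cluster representative."""
    if not reads:
        return 0.0
    clusters = cluster_duplicates(reads, max_mismatches)
    return (len(reads) - len(clusters)) / len(reads)


def read_density(num_reads: int, genome_size: int) -> float:
    """Read density as defined in the abstract: reads divided by genome size."""
    return num_reads / genome_size


if __name__ == "__main__":
    example_reads = ["ACGTACGT", "ACGTAC", "ACGTACGT", "TTTTGGGG", "ACGAACGT"]
    for group in cluster_duplicates(example_reads, max_mismatches=1):
        print(len(group), group[0])
```

Raising `max_mismatches` merges nearly identical reads into the same group, which is one simple way to separate "exact" from "nearly identical" duplicates; the thresholds used by the published tool are not stated in this excerpt.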
