SEQanswers

08-29-2013, 11:52 AM   #1
ssully
Member
Location: NYC
Join Date: Aug 2010
Posts: 48
filtering out human seqs from metagenomic reads

What's considered to be the best tool for this -- removing human sequences from large sets of metagenomic next-gen reads? We tried BMTagger at default values on a set of ~200 million 100 nt Illumina reads, and it left in a lot of reads that hit human seqs with high confidence in a subsequent blastn search vs. the NCBI nt database.
08-29-2013, 01:38 PM   #2
rhinoceros
Senior Member
Location: sub-surface moon base
Join Date: Apr 2013
Posts: 372

What about bowtie2 against the human genome? They even have prebuilt indexes available. Blastn of over 100M reads against nt sounds like a rather wasteful use of computing resources.
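For illustration, a minimal sketch of what the bowtie2 suggestion could look like for single-end reads. It only builds the command line; nothing is executed here. The index prefix and file names are hypothetical placeholders, and the thread count is an arbitrary assumption. The flags used (`-x`, `-U`, `-p`, `--un`, `--al`, `-S`) are standard bowtie2 options, but the settings have not been tuned for this data set.

```python
import shlex


def bowtie2_human_filter_cmd(index_prefix, reads_fastq,
                             nonhuman_out, human_out, threads=8):
    """Build a bowtie2 command that splits single-end reads by human alignment.

    All paths are placeholders supplied by the caller (hypothetical names).
    """
    return [
        "bowtie2",
        "-p", str(threads),     # number of alignment threads
        "-x", index_prefix,     # prefix of a prebuilt human genome index
        "-U", reads_fastq,      # unpaired input reads
        "--un", nonhuman_out,   # reads that failed to align (candidate nonhuman)
        "--al", human_out,      # reads that aligned at least once (candidate human)
        "-S", "/dev/null",      # discard the SAM output; only the split files are kept
    ]


# Hypothetical file names, for illustration only.
cmd = bowtie2_human_filter_cmd(
    "GRCh38_index", "reads.fastq", "nonhuman.fastq", "human.fastq"
)
print(shlex.join(cmd))
```

To actually run it, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where bowtie2 and a human index are installed. For paired-end reads, bowtie2 offers `-1`/`-2` with `--un-conc` and `--al-conc` instead.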
08-29-2013, 01:46 PM   #3
ssully
Member
Location: NYC
Join Date: Aug 2010
Posts: 48

Yes, it's far from good. But that's how many were left in our metagenomic set after filtering out short reads, duplicate reads, and (via BMTagger) human reads. So we'd like to try a better human read remover, to help ensure that the final read set for downstream analysis (e.g. blastn) is all nonhuman. And smaller.
08-29-2013, 01:49 PM   #4
rhinoceros
Senior Member
Location: sub-surface moon base
Join Date: Apr 2013
Posts: 372

Quote:
Originally Posted by ssully View Post
Yes, it's far from good. But that's how many were left in our metagenomic set after filtering out short reads, duplicate reads, and (via BMTagger) human reads. So we'd like to try a better human read remover, to help ensure that the final read set for downstream analysis (e.g. blastn) is all nonhuman. And smaller.
If I were you, I'd do trimming, bowtie2 against the human genome, assembly, and then blasts. Although for certain things like species distribution, assembly tends to introduce a rather big bias (in my experience it increases the apparent presence of the most common taxa).

p.s. If you have human reads, you probably have other contaminants too, like bacteria from human skin among other stuff. Keep that in mind, especially if your contamination rate is high.

Last edited by rhinoceros; 08-29-2013 at 01:51 PM.
08-29-2013, 02:39 PM   #5
ssully
Member
Location: NYC
Join Date: Aug 2010
Posts: 48

We don't want to do assembly, because our main goal is to interrogate the diversity of taxa in our samples. We've done quality score filtering, length filtering, adapter trimming, and duplicate removal. More vigorous quality trimming may be detrimental to uncovering diversity, according to this study.

We are studying a surface microbiome that humans interact with, so we don't mind skin bacteria; we want to catalog those, as well as any eukaryotic seqs. We don't even 'mind' the human sequences, it's just that their numbers make the seq files very large, so we want to split them out and treat human/nonhuman sets separately.
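As a rough sketch of the human/nonhuman split described here: if the reads have been aligned to a human reference and the result is in SAM format, each record's FLAG field says whether the read mapped (bit 0x4 set means unmapped). The function name and the toy records below are made up for illustration; this is not taken from any particular tool's workflow.

```python
def split_sam_by_mapping(sam_lines):
    """Return (human_ids, nonhuman_ids) from an iterable of SAM text lines.

    A read counts as 'human' if it aligned to the human reference
    (FLAG bit 0x4 not set) and 'nonhuman' if it is flagged unmapped.
    """
    human_ids, nonhuman_ids = [], []
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        # Skip secondary (0x100) and supplementary (0x800) records so that
        # each read is counted once.
        if flag & 0x900:
            continue
        if flag & 0x4:
            nonhuman_ids.append(name)
        else:
            human_ids.append(name)
    return human_ids, nonhuman_ids


# Toy SAM records (hypothetical reads), tab-separated as in real SAM files.
toy_sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tchr1\t100\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tIIII",
    "read3\t16\tchrX\t500\t30\t4M\t*\t0\t0\tGGCC\tIIII",
]
human_ids, nonhuman_ids = split_sam_by_mapping(toy_sam)
print("human:", human_ids)
print("nonhuman:", nonhuman_ids)
```

With ~200 million reads it would be more practical to stream the alignment file and write the two sets straight to disk rather than hold ID lists in memory. The sketch only shows the flag logic.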

Last edited by ssully; 08-29-2013 at 02:49 PM.
08-29-2013, 05:58 PM   #6
GenoMax
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 7,080

Perhaps one of these would be useful:

http://edwards.sdsu.edu/labsite/inde...om-metagenomes

http://clovr.org/hmp-dacc/hmp-dacc-c...g-walkthrough/
09-27-2013, 10:06 AM   #7
lac302
Member
Location: DE
Join Date: Dec 2012
Posts: 65

DeconSeq? I haven't used it for anything larger than microbial genomes, but it works fairly well.
10-24-2013, 03:37 AM   #8
leopal
Junior Member
Location: Spain
Join Date: Oct 2013
Posts: 1
Great help!

Quote:
Originally Posted by GenoMax View Post
These links are quite good options!

Thanks!