SEQanswers

08-05-2015, 02:48 AM   #1
naman
Junior Member
 
Location: Germany

Join Date: Sep 2012
Posts: 8
Metagenomics analysis steps

Dear All

I am currently working on shotgun metagenomics data, and I should say that I am very new to this kind of work.

I have paired-end (PE) 100 bp reads from an Illumina platform.

My aim is to obtain functional annotation and to find bacteria that are differentially abundant between two groups.
The steps I am following are listed below, with some questions for each step.
1) Trimming with Trimmomatic, although I read in a review article that LUCY is good for trimming metagenomics data.
Q: Can anyone suggest a metagenomics-specific trimming program other than Trimmomatic/PRINSEQ? What average quality value is generally considered acceptable?
2) Metagenome assembly
a) De novo assemblers: MetaVelvet, Meta-IDBA, Ray, SOAP, Celera, EULER (to compare them all)
Q: How can I check which assembler gives the better contigs? Is there a tool that evaluates assembler output?
b) Reference-based assembly: MIRA, AMOS, Genometa
Q: I have no idea how this works (on contigs, reads, or BLAST output?). Do we need to BLAST all the reads against all the bacterial references?
Or do these tools have their own reference database? Or should it be done with Bowtie against the reference? Or does someone have a better suggestion?
3) Binning:
a) Sequence-similarity-based binning: MEGAN, IMG/M, MG-RAST
Q: What is the difference between this and reference-based assembly?
b) Composition-based binning: GroopM, CONCOCT, PhyloPythia
c) Tools that use both methods: PhymmBL and MetaCluster
Q: Which tool is better for binning host-associated data?
4) Gene prediction tools: MetaGeneMark, Glimmer-MG, mORFind
Q: Which tool is better?

5) Q: How can we do functional annotation?

Cheers,

Last edited by naman; 10-01-2015 at 05:04 AM.
08-05-2015, 10:39 AM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,695

Quote:
Originally Posted by naman
1) Trimming with Trimmomatic, although I read in a review article that LUCY is good for trimming metagenomics data.
Q: Can anyone suggest a metagenomics-specific trimming program other than Trimmomatic/PRINSEQ? What average quality value is generally considered acceptable?
Unlike assemblers, trimming programs are not specialized for metagenomics, RNA-seq, or any other data type. I recommend BBDuk, which is fast and performs well.
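
For readers unsure what quality trimming actually does to a read, here is a minimal Python sketch. It is not BBDuk's algorithm or any other tool's implementation. The function names, the Phred+33 offset, and the min_q=10 cutoff are illustrative assumptions, not recommendations made in this thread.

```python
def phred_scores(quality_line, offset=33):
    """Convert a FASTQ quality string to integer Phred scores.

    offset=33 assumes Phred+33 encoding (an assumption; check your data).
    """
    return [ord(ch) - offset for ch in quality_line]


def trim_right(seq, quality_line, min_q=10, offset=33):
    """Drop bases from the 3' end until one has a score of at least min_q.

    min_q=10 is a placeholder cutoff chosen only for illustration.
    """
    scores = phred_scores(quality_line, offset)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality_line[:end]


def mean_quality(quality_line, offset=33):
    """Average Phred score of a read; 0.0 for an empty read."""
    scores = phred_scores(quality_line, offset)
    return sum(scores) / len(scores) if scores else 0.0
```

Calling trim_right("ACGTACGT", "IIIII###") removes the three low-quality bases at the end of the read and leaves the first five untouched. Real trimmers also handle adapters and paired reads, which this sketch ignores.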

Quote:
2) Metagenome assembly
a) De novo assemblers: MetaVelvet, Meta-IDBA, Ray, SOAP, Celera, EULER (to compare them all)
Q: How can I check which assembler gives the better contigs? Is there a tool that evaluates assembler output?
I suggest MEGAHIT, which we currently use for all production metagenomic assemblies. You can use QUAST to evaluate the output, though since you don't know the correct answer, it's of limited use. To quickly get contiguity statistics, the BBMap package includes a tool called stats.sh, which was specifically designed to scale to huge metagenomes with hundreds of gigabases of assembled sequence. Usage: stats.sh in=contigs.fa
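
To make "contiguity statistics" concrete, here is a minimal Python sketch that computes contig count, total length, longest contig, and N50 from the lines of a FASTA file. It only illustrates the metrics and is not how stats.sh is implemented. The function names are hypothetical, and for assemblies of any real size the dedicated tool is the better choice.

```python
def contig_lengths(lines):
    """Return the length of each sequence in an iterable of FASTA lines."""
    lengths = []
    current = 0
    seen_header = False
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seen_header:
                lengths.append(current)
            seen_header = True
            current = 0
        elif seen_header:
            current += len(line)
    if seen_header:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def summarize(lengths):
    """Collect a few basic contiguity statistics in a dict."""
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }
```

To run it on an assembly, open the file and pass the handle in, e.g. summarize(contig_lengths(open("contigs.fa"))). Comparing these numbers across assemblers shows which one produced longer contigs, but says nothing about whether those contigs are correct.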

Quote:
b) Reference-based assembly: MIRA, AMOS, Genometa
Q: I have no idea how this works (on contigs, reads, or BLAST output?). Do we need to BLAST all the reads against all the bacterial references? Or do these tools have their own reference database? Or should it be done with Bowtie against the reference? Or does someone have a better suggestion?
There's typically no point in a reference-guided assembly of a metagenome...
09-22-2015, 02:18 AM   #3
naman
Junior Member
 
Location: Germany

Join Date: Sep 2012
Posts: 8

Hi,
Thanks for the reply!
At the moment I am trying reference-based assembly with MIRA. I read the manual, and I believe that providing the strain parameter will align the reads to the references (please correct me if I am wrong).
But I have metagenomic data from mouse feces, which can contain multiple strains. So do I need to make my own input strain file, or how does it work?
I know that reference-based assembly may not be the best approach for metagenomic data, but I still want to compare it with the de novo one.
10-01-2015, 05:03 AM   #4
naman
Junior Member
 
Location: Germany

Join Date: Sep 2012
Posts: 8

Hi again,
I have a question about k-mer optimization for metagenomic assemblers.
How is the k-mer size optimized? Since this is not a single genome, how do I decide which k-mer size to use?
Is there any tool or community standard for optimizing the k-mer size specifically for metagenomics? Currently I am running Ray and MetaVelvet with k = 21 to k = 63, and I do not understand how to optimize the k-mer size for my metagenomic assembly, which is from mouse feces.
Thanks, and I look forward to your reply!
11-18-2015, 12:41 PM   #5
lac302
Member
 
Location: DE

Join Date: Dec 2012
Posts: 57

Unfortunately, it will come down to trial and error for metagenomes. Each sample type will have a different optimal k-mer size, depending on its species diversity.

You can use a program like KmerGenie, but even the recommended optimal k-mer size may be off. It would be best to look through the histograms it produces and pick a few values that span a range.
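
As a rough illustration of what such a histogram contains and how one might pick a few k values across a range, here is a minimal Python sketch. It is not KmerGenie's method. The function names are hypothetical, the code does not merge reverse-complement k-mers, and counting in memory like this only works for small test subsets of reads. Restricting the candidates to odd k follows the usual convention for de Bruijn graph assemblers.

```python
from collections import Counter


def kmer_abundance_histogram(reads, k):
    """Map abundance -> number of distinct k-mers seen that many times."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                counts[kmer] += 1
    return Counter(counts.values())


def candidate_ks(k_min, k_max, n):
    """Pick up to n odd k values spread evenly across [k_min, k_max]."""
    odd = [k for k in range(k_min, k_max + 1) if k % 2 == 1]
    if n >= len(odd):
        return odd
    if n == 1:
        return [odd[len(odd) // 2]]
    step = (len(odd) - 1) / (n - 1)
    return [odd[round(i * step)] for i in range(n)]
```

For the k = 21 to k = 63 range mentioned above, candidate_ks(21, 63, 4) gives four values spanning the range, each of which could be used for a trial assembly and then compared with contiguity statistics.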
All times are GMT -8.

