**rglover** · 06-01-2010, 01:15 PM

We do this routinely on our metagenomics datasets - we're using the CLCbio de novo assembler although we're sequencing with a 454 rather than a shorter read technology. Paper here if you want some more details.

**cliffbeall** · 06-28-2010, 11:13 AM

Sorry to be late in answering, but there is a recent study that did assembly on the gut metagenome with Illumina:

Qin et al. (2010) A human gut microbial gene catalogue established by metagenomic sequencing. Nature 464, 59-65.

The scale of the sequencing they did is pretty enormous (576.7 gigabases), so I'm not sure whether the methodology is applicable to smaller scales.

**Suzanne** · 07-15-2010, 02:18 PM

PANGEA for next gen data assembly

Hi, you might want to check out this paper too:

PANGEA: pipeline for analysis of next generation amplicons

The ISME Journal 4, 852-861 (July 2010) | doi:10.1038/ismej.2010.16

Adriana Giongo, David B Crabb, Austin G Davis-Richardson, Diane Chauliac, Jennifer M Mobberley, Kelsey A Gano, Nabanita Mukherjee, George Casella, Luiz FW Roesch, Brandon Walts, Alberto Riva, Gary King and Eric W Triplett

http://www.nature.com/ismej/journal/v4/n7/full/ismej201016a.html

**greigite** · 07-15-2010, 04:03 PM

PANGEA is for amplicon sequencing only though, which is a somewhat different problem that doesn't require assembly. There are various amplicon analysis pipelines, and they all have variants of the following steps: quality filtering, read trimming, clustering, BLAST, taxonomic assignment, and comparison of relative abundances across data sets. They don't involve assembly, though I guess if you were doing Illumina PE amplicon sequencing you'd have to join the two overlapping ends to get a single sequence.
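For illustration, here is a minimal Python sketch of that last step, joining an overlapping read pair into one sequence. It is not taken from PANGEA or any other pipeline; the function names, the `min_overlap` and `max_mismatch_frac` parameters and their defaults are all assumptions, and real read mergers also use base qualities to resolve disagreements.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def merge_pair(fwd, rev, min_overlap=10, max_mismatch_frac=0.1):
    """Join a forward read with its mate where their ends overlap.

    The mate is reverse-complemented first, then the longest overlap
    between the 3' end of `fwd` and the 5' end of the flipped mate is
    searched for. Returns the merged sequence, or None if no overlap of
    at least `min_overlap` bases passes the mismatch threshold.
    Both thresholds are hypothetical defaults, not tuned values.
    """
    rev_rc = reverse_complement(rev)
    longest = min(len(fwd), len(rev_rc))
    for olen in range(longest, min_overlap - 1, -1):
        tail = fwd[-olen:]
        head = rev_rc[:olen]
        mismatches = sum(1 for a, b in zip(tail, head) if a != b)
        if mismatches <= max_mismatch_frac * olen:
            # Keep the forward read's bases in the overlap region.
            return fwd + rev_rc[olen:]
    return None
```

Calling `merge_pair(read1, read2)` on each pair gives either one joined amplicon sequence or `None`, and pairs that return `None` would need to be handled separately.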

**Thomieh** · 08-10-2010, 12:55 AM

Check these:
- Increasing the coverage of a metapopulation consensus genome by iterative read mapping and assembly. Dutilh BE, Huynen MA, Strous M. Bioinformatics, 2009 vol. 25 (21) pp. 2878-81.
- The metagenome of a biogas-producing microbial community of a production-scale biogas plant fermenter analysed by the 454-pyrosequencing technology. Schlüter et al. Journal of Biotechnology, 2008 vol. 136 (1-2) pp. 77-90.

Metagenomes can be hard to assemble, but this really depends on the diversity of the ecosystem you are studying. In simple ecosystems where only a few species are found, assembly is feasible. In more complex environments it depends on how deep your sequencing is. Most metagenomes of complex environments don't reach a coverage of more than 1x, and at that depth assembly is not easy.
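To put rough numbers on the coverage point, here is a back-of-the-envelope Python sketch. It assumes reads are drawn uniformly at random and that `relative_abundance` is the fraction of sequenced DNA coming from the genome of interest; the function names and the example figures are illustrative, not from any of the papers above. The covered fraction uses the standard Lander-Waterman approximation, 1 - exp(-c).

```python
import math


def expected_coverage(total_bases, relative_abundance, genome_size):
    """Expected fold coverage of one genome within a metagenome.

    total_bases:        total sequenced bases in the dataset
    relative_abundance: assumed fraction of the DNA from this genome (0-1)
    genome_size:        genome length in bases
    """
    return total_bases * relative_abundance / genome_size


def fraction_covered(coverage):
    """Expected fraction of the genome hit by at least one read,
    under the idealized Lander-Waterman model of random sampling."""
    return 1.0 - math.exp(-coverage)


# Hypothetical example: 500 Mb of sequence, a 5 Mb genome at 1% abundance.
c = expected_coverage(500e6, 0.01, 5e6)
print(f"coverage: {c:.2f}x, fraction covered: {fraction_covered(c):.2f}")
```

Under these assumptions a genome at 1x coverage still has roughly a third of its bases untouched by any read, which is one way to see why assembly is so hard at that depth.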

And once you have an assembly, the question remains: how can you test that what you have assembled is really valid? For that you need to be able to amplify the fragment using PCR and resequence it.

**saul** · 08-12-2010, 09:19 AM

I've recently been playing with the MetaHIT data from the paper cited above, assembling it with the CLC bio de novo assembler (see http://www.clcdenovo.com/). By gross statistics (total assembled bp in contigs > 200 bp, average size of contigs > 200 bp, N50, etc.), my assemblies were better than those reported in the paper using SOAPdenovo, and required a fraction of SOAPdenovo's time, CPU and memory. In fact, I could run the assemblies in 8 GB of memory. Give it a try!
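For anyone wanting to compute the same kind of gross statistics on their own assembly, here is a minimal Python sketch. It takes a plain list of contig lengths; the function name, the returned keys and the `min_len` parameter are assumptions for illustration, not the output of CLC or SOAPdenovo. N50 is computed in the usual way: the length of the contig at which the running total, going from longest to shortest, first reaches half of the total assembled bases.

```python
def assembly_stats(contig_lengths, min_len=200):
    """Summarize an assembly from its contig lengths.

    Only contigs strictly longer than `min_len` bp are counted, to
    match the "contigs > 200 bp" convention used above.
    """
    kept = sorted((n for n in contig_lengths if n > min_len), reverse=True)
    if not kept:
        return {"n_contigs": 0, "total_bp": 0, "mean_bp": 0.0, "n50": 0}

    total = sum(kept)
    running = 0
    n50 = 0
    for length in kept:
        running += length
        # First contig at which the running sum reaches half the total.
        if running * 2 >= total:
            n50 = length
            break

    return {
        "n_contigs": len(kept),
        "total_bp": total,
        "mean_bp": total / len(kept),
        "n50": n50,
    }
```

The lengths can come from any FASTA parser. Comparing two assemblies this way only makes sense if the same `min_len` cutoff is applied to both.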

**barabara** · 05-23-2011, 06:13 AM

Hi Saul,
Can you send me more details on how you analyzed the MetaHIT data with CLC bio? I would like to try analyzing my own metagenomic samples, and I am a beginner who would like to learn.

Thank you!

**raw937** · 06-12-2011, 12:36 PM

META-Velvet!!!

Check it out! ABySS is also good.

**ucpete** · 09-01-2011, 04:09 PM

Price

This paper hasn't been published yet, but this is hands down the best solution for metagenomic assembly I've used:

http://derisilab.ucsf.edu/software/price/index.html

It's still being updated every couple weeks...

Assembly of short reads in Metagenomic studies
