  • Microbiome assembly and its normalization

    Hello,

    I am working on 24 human gut microbiome samples generated on an Illumina HiSeq, almost 800 GB of data in total.
    I have optimized and assembled the data with Velvet at several different k-mer sizes, giving assembly sizes varying from 35 to 115 Mb. My question: is there any need to justify assembling all the data at different k-mers, or should I go with a single k-mer assembly? Also, how do I proceed with assembly normalization?

  • #2
    For your assembly, do you mean you pooled all the data together and then assembled it with Velvet? I think this paper may be a good reference



    • #3
      My suggestion is to try a slew of different assemblers and different k-mer values.

      In terms of genome assembly, a larger k-mer generally gives a better assembly because it reduces ambiguity from repeats in the sequence.

      At the same time, it all depends on how much depth of coverage you want. A smaller k-mer value gives you more k-mer depth of coverage, but at the same time a higher chance of mis-assembling the reads.
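      To make the depth trade-off concrete, the standard relation between read coverage and k-mer coverage (the one Velvet's documentation uses) can be sketched in a few lines of Python; the 100x / 100 bp numbers below are made up for illustration:

```python
# Trade-off between k-mer size and effective depth of coverage:
# C_k = C * (L - k + 1) / L, where C is read coverage, L is read
# length, and k is the k-mer size. Larger k means fewer k-mers
# per read, hence lower k-mer coverage.

def kmer_coverage(read_coverage: float, read_length: int, k: int) -> float:
    """Effective k-mer coverage for a given read coverage, read length and k."""
    return read_coverage * (read_length - k + 1) / read_length

# Example: 100x read coverage with 100 bp HiSeq reads.
for k in (31, 51, 71, 91):
    print(k, kmer_coverage(100, 100, k))
```

      So at k=91 the same library only provides 10x k-mer coverage where k=31 still gives 70x, which is why small-k assemblies look "deeper" but collapse repeats more readily.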

      So to overcome this, I would suggest taking all your assembled contigs from the different k-mers and trying to re-assemble them to form "super contigs".

      This will help you achieve greater coverage (or reduce redundant coverage of a given genome by collapsing regions that were assembled multiple times) and reduce any bias that each assembler might have.
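      As a toy illustration of that merging step: pool contigs from multiple k-mer assemblies and drop any contig contained in a longer one. Real tools (CD-HIT, for instance) also handle reverse complements and near-identical sequences, which this sketch deliberately ignores; all sequences below are made up:

```python
def merge_contig_sets(assemblies):
    """Pool contigs from several assemblies, dropping exact duplicates
    and contigs fully contained in a longer contig (a crude stand-in
    for a proper redundancy-removal tool)."""
    # Longest first, so containment checks only look at already-kept contigs.
    contigs = sorted({c for asm in assemblies for c in asm}, key=len, reverse=True)
    kept = []
    for c in contigs:
        if not any(c in longer for longer in kept):
            kept.append(c)
    return kept

k31_contigs = ["ACGTACGTAA", "TTGGCC"]
k51_contigs = ["ACGTACGTAACCGG", "TTGGCC", "GATTACA"]
print(merge_contig_sets([k31_contigs, k51_contigs]))
```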

      Hope this helps a bit.

      -Zapages



      • #4
        Yes, thanks for the great help!



        • #5
          IMO there are several newer assemblers out there that typically outperform Velvet in terms of both accuracy and speed. I typically use SPAdes for bacterial de novo assembly:


          see these as well for reference:

          De novo genome assembly is the process of reconstructing a complete genomic sequence from countless small sequencing reads. Due to the complexity of this task, numerous genome assemblers have been developed to cope with different requirements and the different kinds of data provided by sequencers within the fast evolving field of next-generation sequencing technologies. In particular, the recently introduced generation of benchtop sequencers, like Illumina's MiSeq and Ion Torrent's Personal Genome Machine (PGM), popularized the easy, fast, and cheap sequencing of bacterial organisms to a broad range of academic and clinical institutions. With a strong pragmatic focus, here, we give a novel insight into the line of assembly evaluation surveys as we benchmark popular de novo genome assemblers based on bacterial data generated by benchtop sequencers. Therefore, single-library assemblies were generated, assembled, and compared to each other by metrics describing assembly contiguity and accuracy, and also by practice-oriented criteria as for instance computing time. In addition, we extensively analyzed the effect of the depth of coverage on the genome assemblies within reasonable ranges and the k-mer optimization problem of de Bruijn Graph assemblers. Our results show that, although both MiSeq and PGM allow for good genome assemblies, they require different approaches. They not only pair with different assembler types, but also affect assemblies differently regarding the depth of coverage where oversampling can become problematic. Assemblies vary greatly with respect to contiguity and accuracy but also by the requirement on the computing power. Consequently, no assembler can be rated best for all preconditions. Instead, the given kind of data, the demands on assembly quality, and the available computing infrastructure determines which assembler suits best. 
The data sets, scripts and all additional information needed to replicate our results are freely available at ftp://ftp.cebitec.uni-bielefeld.de/pub/GABenchToB.



          • #6
            I've actually used Minia for metagenome assembly with good success.

            Minia
            http://minia.genouest.org/

            Determining an optimal kmer size for a metagenome is tough. My suggestion would be to try several.

            KmerGenie
            http://arxiv.org/pdf/1304.5665.pdf
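
            Trying several k values empirically can be sketched like this toy (KmerGenie does the real thing by fitting full k-mer abundance histograms; this just counts k-mers seen at least twice, and all the reads are made up):

```python
from collections import Counter

def solid_kmers(reads, k, min_count=2):
    """Count distinct k-mers seen at least min_count times: a very
    rough proxy for the 'genomic' k-mers that KmerGenie estimates
    from full abundance histograms."""
    counts = Counter(r[i:i + k] for r in reads for i in range(len(r) - k + 1))
    return sum(1 for c in counts.values() if c >= min_count)

reads = ["ACGTACGTACGT", "CGTACGTACGTA", "GTACGTACGTAC"]
for k in (4, 6, 8):
    print(k, solid_kmers(reads, k))
```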



            • #7
              Originally posted by fanli View Post
              IMO there are several newer assemblers out that typically outperform velvet in terms of both accuracy and speed. I typically use SPAdes for bacterial de novo assembly:


              see these as well for reference:

              http://journals.plos.org/plosone/art...l.pone.0107014
              Hi fanli,
              All of these seem to be genome assemblers, developed and tested on isolate genomic data, I think.



              • #8
                I have to say, SPAdes is slow even on a single microbe; I doubt you could run it on 800 Gbp of metagenomic reads.

                We had been using SOAPdenovo and sometimes Ray for our metagenomes, but now we are using MEGAHIT, which is faster and uses less memory than SOAPdenovo.

                also how do I proceed to assembly normalization
                Can you clarify? I have written a normalization program to reduce high-depth reads prior to assembly, but I'm not sure that's what you are looking for.
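
                If digital normalization is what you mean, the core idea fits in a few lines. This is only a memory-hungry toy of the approach (real tools use compact probabilistic k-mer counters, which this ignores); the reads and parameters are made up:

```python
from collections import Counter

def normalize(reads, k=5, target=3):
    """Digital normalization sketch: keep a read only if the median
    count of its k-mers, among reads kept so far, is still below
    `target`. High-depth regions stop accepting new reads early."""
    counts = Counter()
    kept = []
    for r in reads:
        kmers = [r[i:i + k] for i in range(len(r) - k + 1)]
        median = sorted(counts[km] for km in kmers)[len(kmers) // 2]
        if median < target:
            kept.append(r)
            counts.update(kmers)
    return kept

# 10 identical high-depth reads collapse to a few; the rare read survives.
kept = normalize(["ACGTACGTAC"] * 10 + ["TTTTTAAAAA"], k=5, target=3)
print(kept)
```

                Note the counts only include reads already kept, so rare reads are never thrown away just because they arrive late.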
