SEQanswers

Old 12-09-2013, 11:49 PM   #1
Etherella
Member
 
Location: Moscow

Join Date: Aug 2012
Posts: 20
Bacterial genome assembly on MiSeq

Hello!
I am planning to make a Nextera XT library prep (a regular bacterial genome, a couple of megabases), sequence it with a 300-cycle kit (2x150 bp), and then assemble it. As far as I understand, the MiSeq has built-in software that does everything automatically (including Velvet assembly for small genomes). Do I need to launch the process after the sequencing, or is everything done automatically?
Old 12-10-2013, 12:59 AM   #2
TonyBrooks
Senior Member
 
Location: London

Join Date: Jun 2009
Posts: 298

Everything should be done automatically if you set it up on your sample sheet. The assembly is carried out in BaseSpace. However, the Run Options screen on your MiSeq lets you replicate the analysis locally, which you might want to do if you are not using BaseSpace.
We've found the Velvet assembly on MiSeq to be rather hit and miss in terms of quality, ranging from acceptable to very poor. We got much better assemblies using an OLC assembler than we ever got with Velvet.
Old 12-10-2013, 04:41 AM   #3
mcnelson.phd
Senior Member
 
Location: Connecticut

Join Date: Jul 2011
Posts: 162

I'll echo Tony's statement that assemblies coming straight off the MiSeq are very hit or miss. For one genome we once got a 2.8 Mbp contig that was nearly perfect out of a 3.5 Mbp genome, but we've also gotten assemblies with N50s of 2 kbp and no contigs larger than 50 kbp. A large part of the problem is that the data doesn't appear to be pre-processed in any way to trim off low-quality regions or to look for PCR duplicates.

I'd suggest setting up your run to produce the assembly, but also do the work yourself to compare. Most likely you'll find that you can do a much better job and be glad you didn't just rely on the system to give you an assembly.
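
To illustrate the kind of pre-processing mentioned above, here is a minimal Python sketch of 3' quality trimming. It assumes Phred+33 quality strings (the encoding used in current Illumina FASTQ files); the function name `trim_3prime` and the Q20 cutoff are illustrative choices, not what the MiSeq software or any particular trimming tool does.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Trim bases from the 3' end while their Phred score is below min_q.

    seq and qual are the sequence and quality strings of one FASTQ record.
    offset=33 assumes Phred+33 encoding; min_q=20 is an arbitrary example cutoff.
    """
    end = len(seq)
    # Walk back from the 3' end until a base meets the quality cutoff.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


# 'I' encodes Q40 and '#' encodes Q2 in Phred+33.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "IIIII###")
print(trimmed_seq, trimmed_qual)  # ACGTA IIIII
```

This only removes a low-quality tail; it stops at the first good base from the 3' end, so isolated bad bases inside the read are kept. Dedicated trimmers typically use sliding windows and also handle adapters.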
Old 12-10-2013, 05:08 AM   #4
TonyBrooks
Senior Member
 
Location: London

Join Date: Jun 2009
Posts: 298

We once got an N50 shorter than the read length (251PE).

<AssemblyStatistics>
<NumberOfContigs>59444</NumberOfContigs>
<MeanContigLength>56.10188</MeanContigLength>
<MedianContigLength>46</MedianContigLength>
<MinContigLength>31</MinContigLength>
<MaxContigLength>560</MaxContigLength>
<BaseCount>3334920</BaseCount>
<N50>62</N50>
</AssemblyStatistics>

No idea what was going on there. All quality stats suggested it was good sequencing (12M reads from a v2 500-cycle kit with 93% >Q30). We assembled the data offline without problems, and the N50 went up to 150 kb.
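
For anyone comparing assemblies by hand, here is a minimal Python sketch of how an N50 can be computed from a list of contig lengths. The function name `n50` and the toy lengths are hypothetical; this uses the common definition (the contig length at which the running total of the longest contigs reaches half of all assembled bases) and is not necessarily how the MiSeq software derives the value in the XML above.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    half = sum(lengths) / 2
    running = 0
    # Add contigs from longest to shortest until half the bases are covered.
    for length in lengths:
        running += length
        if running >= half:
            return length


# Hypothetical toy assembly: 300 bases in total, so half is 150.
# 100 + 80 = 180 >= 150, so the N50 is 80.
print(n50([100, 80, 60, 40, 20]))  # 80
```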

If you want to use Velvet, Nick Loman has a good guide on how to pre-process the reads:
http://pathogenomics.bham.ac.uk/blog...nome-assembly/
Old 12-10-2013, 10:20 PM   #5
Etherella
Member
 
Location: Moscow

Join Date: Aug 2012
Posts: 20

Thanks a lot! Well, anyway, the MiSeq stores the unaligned FASTQ data, so I will be able to have a look at the automatic assembly and then, if the quality is lacking, try other software or run Velvet again with pre-processing.
Old 12-13-2013, 07:58 AM   #6
nucleus
Junior Member
 
Location: Albany,ny

Join Date: May 2013
Posts: 7

I've had trouble with Velvet before. The issue was not the read quality but rather that the sequencing was too deep (above roughly 50x, Velvet falls apart). I am now almost exclusively using SPAdes (http://bioinf.spbau.ru/spades/), which does the read correction and assembly in one go and doesn't mind very deep coverage. SPAdes also gives me better results than CLC.
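
A rough way to check whether a run is in that range is to estimate the coverage as reads x read length / genome size. The Python sketch below does this and also gives the fraction of reads to keep to reach a target depth. The function names and the example numbers are hypothetical, and the ~50x figure is just the experience reported above, not a documented Velvet limit.

```python
def estimated_coverage(n_reads, read_length, genome_size):
    """Approximate sequencing depth: total sequenced bases / genome size.

    n_reads counts individual reads, so a paired-end run contributes
    two reads per cluster.
    """
    return n_reads * read_length / genome_size


def subsample_fraction(current_coverage, target_coverage):
    """Fraction of reads to keep to reach target_coverage (at most 1.0)."""
    return min(1.0, target_coverage / current_coverage)


# Hypothetical run: 2 million 150 bp reads on a 3 Mbp genome.
cov = estimated_coverage(2_000_000, 150, 3_000_000)
print(cov)                          # 100.0
print(subsample_fraction(cov, 50))  # 0.5
```

When subsampling paired-end data, keep or drop both mates of a pair together so the two FASTQ files stay in sync.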

Last edited by nucleus; 12-13-2013 at 08:01 AM.