SEQanswers

Old 10-06-2015, 08:39 PM   #1
jerrybug109
Member
 
Location: USA

Join Date: Oct 2015
Posts: 10
Default NGS, coverage and read length appropriate for assembling a genome? (newbie here!)

Hello bioinformaticians!

First post here, sorry if it's not in the completely correct forum section. I'm completely new to bioinformatics and have just been assigned to a genomic DNA sequencing + genome assembly project, and I would appreciate your advice on some basic questions!

We are conducting a population survey of ~90 strains of Bacillus subtilis (genome size ≈ 4 Mb). We would like to do full genome sequencing on each of these strains. We already have many sequenced reference genomes and will use those as anchors.

We will purify DNA from each strain and will have 90 individual DNA samples. We want to send these DNA samples out to a university/company to be sequenced. Using that data, we want to assemble the genome of each of these strains with Velvet, SPAdes, or an alternative.

Right now, I have the responsibility of choosing where our DNA samples get sent out, what NGS platform to use, and what run specifications to use for our project.

My issue right now is that I'm not sure which next-generation sequencing platform is suitable for our project, given that genome assembly is our goal. How do I pick between Illumina (MiSeq, HiSeq, etc.) and PacBio?

I'm also unsure what depth of coverage would be acceptable for our purposes; I've been told 10X should be good enough, but that seems low to me - perhaps 20X would suffice?

I'm also not sure what read lengths would be appropriate - would paired-end reads of 2x150 or 2x250 be good?

I'm from a different field so I have a lot to learn - I'd appreciate any pointers you have. Thanks!
Old 10-06-2015, 11:33 PM   #2
luc
Senior Member
 
Location: US

Join Date: Dec 2010
Posts: 451
Default

What does "full genome sequencing" mean to your PI?
I assume you want to find simple variants and are not interested in the repetitive genomic regions?
For genome assemblies you should be aiming for at least 100x genome coverage.

A HiSeq lane should give you enough for all samples. A paired-end 150 bp read lane (about 100 Gb) on a HiSeq 3000/4000 would generate the most data per dollar, but other types of reads would work, too.
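
As a rough back-of-the-envelope check on those numbers, here is a minimal Python sketch. It assumes the figures from this thread (90 strains, a ~4 Mb genome, 100x target coverage, 2x150 bp reads); the function names are made up for illustration, and the estimate ignores reads lost to adapter trimming, quality filtering, or uneven pooling.

```python
import math

def required_yield(n_samples, genome_size_bp, coverage):
    """Total bases needed to reach `coverage` on every sample."""
    return n_samples * genome_size_bp * coverage

def read_pairs_per_sample(genome_size_bp, coverage, read_len_bp):
    """Paired-end read pairs needed for one sample (two reads per pair)."""
    return math.ceil(genome_size_bp * coverage / (2 * read_len_bp))

# Figures taken from the thread: 90 strains, ~4 Mb genome, 100x target, 2x150 bp.
total = required_yield(90, 4_000_000, 100)
pairs = read_pairs_per_sample(4_000_000, 100, 150)

print(f"total yield needed: {total / 1e9:.1f} Gb")
print(f"read pairs per sample: {pairs:,}")
```

Under these assumptions the total comes to 36 Gb, which is well under the roughly 100 Gb quoted for a single lane.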

Last edited by luc; 10-10-2015 at 09:36 AM.
Old 10-08-2015, 11:36 AM   #3
jerrybug109
Member
 
Location: USA

Join Date: Oct 2015
Posts: 10
Default thoughts on using nanopore MinION to DNA sequence for the purpose of genome assembly?

So I've just discovered this new device called MinION from Oxford Nanopore: https://www.nanoporetech.com/

And it looks like a potentially cost-effective way to do DNA sequencing. At my lab, we are looking to sequence and assemble the genomes (using genome assemblers like Velvet, SPAdes, etc.) of 90 individual strains of Bacillus subtilis (a Gram-positive bacterium with a 4 Mb genome). This is not de novo assembly because we have many reference genomes available to resequence against.

Our lab is a bit budget-strapped, and it seems like sequencing our 90 strains at an NGS facility is out of our budget. MinION looks like a cheaper way to sequence DNA. However, I'm unclear as to the efficacy of MinION and whether the results it produces would be good enough for genome assembly.

Does anyone have any experience or thoughts about using MinION instead of an NGS platform (like Illumina MiSeq or HiSeq) to obtain DNA sequences for the intended downstream application of assembling genomes?

It looks like exciting tech, but I don't know its reputation or whether it would produce DNA sequence results good enough for assembling bacterial genomes.

Thanks!
Old 10-08-2015, 12:38 PM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Multiplexing all 90 in a single Illumina HiSeq run would be by far the cheapest approach. The sequencing in that case would not really be very expensive, just the cost of making 90 libraries, which you would need to do regardless of platform. MinION is currently a very experimental platform with unpredictable results, and not suitable for a budget-strapped lab to use in an attempt to get quality assemblies.

You should target around 72 Gbp of sequence data (for Illumina), which you can get from a single HiSeq lane running 2x150bp reads. So, it's just the cost of one lane, really. Maybe $5000? Depends on the lab.
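
To see what a lane of that size means per strain, here is a small Python sketch. It assumes the 90 libraries are pooled evenly (real pools rarely are) and uses the ~4 Mb genome size from the first post; the function name is hypothetical.

```python
def per_sample_coverage(lane_yield_bp, n_samples, genome_size_bp):
    """Mean coverage per sample if the lane is split evenly across samples."""
    return lane_yield_bp / (n_samples * genome_size_bp)

# Lane yields mentioned in this thread; even pooling is an assumption.
for label, lane_yield in [("72 Gbp", 72e9), ("100 Gb", 100e9)]:
    cov = per_sample_coverage(lane_yield, n_samples=90, genome_size_bp=4e6)
    print(f"{label} lane: ~{cov:.0f}x per strain")
```

With even pooling, 72 Gbp works out to about 200x per strain, which leaves headroom over the 50x to 100x targets suggested elsewhere in this thread even if some libraries come out under-represented.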

Go check out (or post in) the vendor forum if you want specific price quotes:
http://seqanswers.com/forums/forumdisplay.php?f=30

Last edited by Brian Bushnell; 10-08-2015 at 12:46 PM.
Old 10-08-2015, 12:52 PM   #5
Bukowski
Senior Member
 
Location: UK

Join Date: Jan 2010
Posts: 390
Default

Just buddy up with someone with a PacBio, it will rip through those microbial genomes far more effectively - if you care about the quality of the assembly.
Old 10-09-2015, 05:40 AM   #6
maxsalm
Member
 
Location: London

Join Date: Feb 2015
Posts: 18
Default

Hi there!

I would take a look at this excellent review to help with your coverage requirements:
http://www.nature.com/nrg/journal/v1...l/nrg3642.html
Old 10-09-2015, 06:49 AM   #7
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 838
Default

My recommendation would be to get the device now and use the slow-to-arrive-but-free flow cells and reagents from ONT for testing out the device for your applications on single samples. There's an amplification-free protocol with 96 barcodes, but I expect it will be a few months before that can be used by the unwashed masses:

http://publications.nanoporetech.com...nd-promethion/

If you want things done "right now", and you've got money to burn, go with PacBio. If you can afford to wait a few months, it might be worth holding onto that money and running samples on Nanopore after fast mode becomes standard.
Old 10-12-2015, 12:41 PM   #8
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912
Default

10x might be okay to find clonal mutations, but it's not sufficient for de novo assembly. 50x would be the minimum there.

The holes in your assembly will be caused by long repetitive elements like transposons, and changing read length from 150 to 250 bp isn't going to help that much. You want more reads, not more length.
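
One way to get a feel for the 10x versus 50x gap is a simple Poisson model of per-base depth. The Python sketch below is only an idealization: it assumes reads land uniformly at random, and the "depth below 5" cutoff is an arbitrary threshold chosen for illustration, not something an assembler enforces. Real libraries are more uneven than Poisson, so the true thin-coverage fraction would be larger.

```python
import math

def fraction_below_depth(mean_coverage, min_depth):
    """P(depth < min_depth) when per-base depth ~ Poisson(mean_coverage)."""
    return sum(
        math.exp(-mean_coverage) * mean_coverage**k / math.factorial(k)
        for k in range(min_depth)
    )

genome_size_bp = 4_000_000  # ~4 Mb genome, as stated in the thread
for cov in (10, 20, 50):
    frac = fraction_below_depth(cov, min_depth=5)  # cutoff of 5 is an assumption
    print(f"{cov}x mean: {frac:.2e} of bases below depth 5 "
          f"(~{frac * genome_size_bp:,.0f} bp)")
```

Under this model, a few percent of a 4 Mb genome sits below depth 5 at 10x mean coverage, while at 50x that fraction is effectively zero. Repeats remain a separate problem that extra depth does not solve.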
Old 10-13-2015, 05:00 AM   #9
WhatsOEver
Senior Member
 
Location: Germany

Join Date: Apr 2012
Posts: 215
Default

It depends a little on what you're after. Larger structural variations might be as relevant as short/single-nucleotide polymorphisms. As you have a really small genome, I would recommend going for an initial PacBio run which is "error-corrected" by a subsequent Illumina HiSeq run (2x125bp will do in this scenario).
Old 10-13-2015, 11:40 AM   #10
gringer
David Eccles (gringer)
 
Location: Wellington, New Zealand

Join Date: May 2011
Posts: 838
Default

How much money do you have to work with? What are you going to do with these data?

A PacBio run at 50x coverage should give you almost perfect results for a de novo assembly, but I suspect the cost will be a little too high. At 50x coverage that would be about 5 PacBio runs on their system that just recently came out.

Depending on the application, a 10x MiSeq run might be good enough for you. If you're only interested in mapping rather than assembly, then you can get away with fairly low coverage.
Old 10-16-2015, 07:27 AM   #11
jerrybug109
Member
 
Location: USA

Join Date: Oct 2015
Posts: 10
Default Illumina vs Ion Torrent for resequencing genome?

Hi all,

I'm trying to resequence the genomes of 30 strains of Bacillus subtilis using good reference genomes (~4.2 Mb genome). We have extracted DNA from them and want to send the samples to an NGS service to have them library prepped, barcoded and sequenced. Afterwards we'd assemble the genomes back at our own lab.

In terms of cost and efficiency for genome assembly, do you guys know if there's any significant difference between the Illumina platforms and Ion Torrent? It seems that the best bet would be to go with MiSeq (2x150 or 2x250), but I'm not really familiar with Ion Torrent and was wondering if it is less expensive?

Have a good day, thanks all!

Last edited by jerrybug109; 10-16-2015 at 07:48 AM.
Old 10-16-2015, 08:43 AM   #12
Bukowski
Senior Member
 
Location: UK

Join Date: Jan 2010
Posts: 390
Default

Cross-posted to BioStars: https://www.biostars.org/p/162262/
Old 10-16-2015, 10:23 AM   #13
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

It would be more effective if you updated your existing threads rather than continually creating new ones. I am combining these.

All times are GMT -8.