SEQanswers

Old 10-31-2011, 03:50 AM   #1
isildur
Junior Member
 
Location: Cyprus

Join Date: Sep 2010
Posts: 5
Default miSeq and Exome sequencing

Hi all,
we are looking to buy our first NGS machine for our lab, to use from the beginning of 2012, and I am in charge of recommending a suitable machine for our needs. We work on the human genome and cytogenetics in my lab.
We have a limited budget and our idea is to get Illumina's MiSeq (we are also considering the Roche GS Junior System, but we are leaning towards the MiSeq).

As I come from a computer science background and it's the first time I have ventured into the world of NGS machines (before, I was only on the analysis side), I have a few questions and would be grateful if anyone could answer them or direct me to something to read:

1. One of the applications we will want to do is exome sequencing. I have read that the MiSeq is not able to do whole genome sequencing unless we are talking about a very small genome. What about the exome? Can I use the TruSeq Exome Enrichment kit with the DNA sample preparation kit to prepare our libraries and sequence them on the MiSeq?

2. I read in this forum that the MiSeq can produce 1 Gb of 2x150 bp reads and that for a 50 Mb exome capture that would be 20x coverage. Can someone explain to me how these numbers are calculated? Given a capture library of X Mb and paired-end reads of 150 bp each, how do I know how much coverage I will get? Also, if I want at least an average of 50x coverage at each position of the exome, how do I calculate how much data I need and the read length I should set the MiSeq to produce for the best outcome? Could you direct me to a publication or something I can read to clear up these notions and numbers in my head? I need to understand them to be able to assess NGS machines.

3. Can I create libraries of different parts of the exome to be resequenced on the MiSeq, so that I get better coverage at each position?

4. Does anyone know the cost of the MiSeq machine and how much a single run will cost?

Thank you in advance
Old 10-31-2011, 07:37 AM   #2
Heisman
Senior Member
 
Location: St. Louis

Join Date: Dec 2010
Posts: 535
Default

The MiSeq is actually a lot more expensive than a HiSeq in terms of per-base cost. I don't know what the cost of the instrument is relative to the HiSeq. If it's similar or you plan to do a lot of sequencing, it might be better to get the HiSeq. The real advantage of the MiSeq is the turnaround time (1 day vs. ~2 weeks).

In terms of coverage:

50 Mb exome x 50x coverage = 2.5 Gb of sequence. So if you did 2x150 PE reads, you would need (2.5x(10^9))/300 = 8.33 million read pairs. HOWEVER, with exome sequencing you will not get every base pair on target and you will not get completely even coverage of the whole exome. Three reasons for this:

1. The average human exon I think is 125 bp or so. If you have a 2x150 paired end read, that is 300 bp of sequence for an exon that is only 125 bp. Right there you are losing over half of your data.

2. The way whole exome capture works means some of your reads will align to random places on the genome. You can probably assume 65% of your reads will align to the exome, maybe more.

3. For other technical reasons, you won't get even coverage of each exon. So you might need to sequence to an average depth of 70x or so to really get almost all of the bases at least 50x.

Bottom line: the MiSeq is not going to give you enough reads to analyze an exome without running the same sample multiple times, and it's a lot more expensive per base than the HiSeq, so I would definitely look into this more.
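
The arithmetic in this post can be sketched in a few lines of Python. This is only a rough sketch: the function name and its parameters are made up for illustration, and the 65% on-target rate and 70x mean depth are the approximate figures quoted above, not measured values. It also ignores the loss from reads overlapping on short inserts (reason 1).

```python
def read_pairs_needed(target_bp, mean_depth, read_len, on_target=1.0):
    """Estimate how many read pairs are needed for a mean depth over a target.

    target_bp  -- size of the capture target in bases
    mean_depth -- desired average coverage (50 means 50x)
    read_len   -- length of each read in a pair (150 for 2x150)
    on_target  -- assumed fraction of sequenced bases that land on the target
    """
    bases_on_target = target_bp * mean_depth
    bases_to_sequence = bases_on_target / on_target
    return bases_to_sequence / (2 * read_len)


# Idealized case: 50 Mb exome, 50x, 2x150, every base on target
ideal = read_pairs_needed(50e6, 50, 150)

# More cautious case: 70x mean depth, 65% of bases on target
cautious = read_pairs_needed(50e6, 70, 150, on_target=0.65)

print(f"ideal:    {ideal / 1e6:.2f} million pairs")
print(f"cautious: {cautious / 1e6:.2f} million pairs")
```

With the cautious inputs the estimate roughly doubles, to about 18 million pairs, before any overlap loss is counted.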
Old 10-31-2011, 08:12 AM   #3
TonyBrooks
Senior Member
 
Location: London

Join Date: Jun 2009
Posts: 298
Default

Quote:
Originally Posted by Heisman View Post
The MiSeq is actually a lot more expensive than a HiSeq in terms of per-base cost. I don't know what the cost of the instrument is relative to the HiSeq. If it's similar or you plan to do a lot of sequencing, it might be better to get the HiSeq. The real advantage of the MiSeq is the turnaround time (1 day vs. ~2 weeks).
The MiSeq is also a much cheaper piece of equipment. It depends whether you have a large enough capital budget to purchase a HiSeq. Also, the MiSeq is very easy to run, requires no additional equipment (i.e. cBot or analysis server) and has analysis software on board. The MiSeq has been designed for targeted capture and resequencing, 16S metagenomics and small genomes. For anything else, I would recommend forging connections with centres that have HiSeqs.
GSJuniors are a bit cheaper to buy, but relatively very expensive to run (per base). I would also consider the Ion Torrent, as they are rapidly improving the technology, scaling up read number and increasing read length all the time.
The Ion Torrent is cheaper than the MiSeq to buy, cheaper to run, and will hopefully hit 10m 400bp (modal) reads within a year.


Quote:
Originally Posted by Heisman View Post
50 Mb exome x 50x coverage = 2.5 Gb of sequence. So if you did 2x150 PE reads, you would need (2.5x(10^9))/300 = 8.33 million read pairs. HOWEVER, with exome sequencing you will not get every base pair on target and you will not get completely even coverage of the whole exome. Three reasons for this:

1. The average human exon I think is 125 bp or so. If you have a 2x150 paired end read, that is 300 bp of sequence for an exon that is only 125 bp. Right there you are losing over half of your data.

2. The way whole exome capture works means some of your reads will align to random places on the genome. You can probably assume 65% of your reads will align to the exome, maybe more.

3. For other technical reasons, you won't get even coverage of each exon. So you might need to sequence to an average depth of 70x or so to really get almost all of the bases at least 50x.

Bottom line: the MiSeq is not going to give you enough reads to analyze an exome without running the same sample multiple times, and it's a lot more expensive per base than the HiSeq, so I would definitely look into this more.
Agreed here. Purely on a cost basis, it's a very inefficient way to sequence an exome. If it were absolutely necessary, there's no reason why it couldn't be done; it's just time-consuming and expensive. Remember, you generally need >40X coverage to call SNPs correctly, the Illumina exome kits enrich 62 Mb, and enrichment only runs at 65-70%. All this means you'd probably need to do at least three runs per exome.
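
As a rough illustration of the "at least three runs" estimate, here is a small Python sketch. The per-run yields tried below are assumptions for illustration (the post does not state one), the function name is hypothetical, and 40x is treated as the mean depth, which is optimistic given uneven coverage.

```python
import math


def runs_needed(target_bp, depth, enrichment, yield_per_run_bp):
    """Number of whole runs needed to reach a mean depth over the target.

    enrichment       -- fraction of sequenced bases that land on target
    yield_per_run_bp -- assumed output of a single run, in bases
    """
    total_bp = target_bp * depth / enrichment
    return math.ceil(total_bp / yield_per_run_bp)


# 62 Mb target, 40x, 65% enrichment, for a few assumed per-run yields
for yield_gb in (1.0, 1.5, 2.5):
    n = runs_needed(62e6, 40, 0.65, yield_gb * 1e9)
    print(f"{yield_gb} Gb per run -> {n} run(s)")
```

The total requirement here is about 3.8 Gb, so the run count depends heavily on which yield figure you plug in.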

UK (list) Price of the MiSeq is just under 85k. I'm not 100% on prices, but I think our rep stated it'd be about 500 per run (I assume that's the 50bp kit).
Old 10-31-2011, 02:29 PM   #4
NextGenSeq
Senior Member
 
Location: USA

Join Date: Apr 2009
Posts: 482
Default

Illumina is heavily discounting the MiSeq reagents for new customers. Your price per run is over twice the discount price.
Old 10-31-2011, 08:18 PM   #5
ECO
--Site Admin--
 
Location: SF Bay Area, CA, USA

Join Date: Oct 2007
Posts: 1,358
Default

Thought I'd chime in here with some real-world results from our MiSeqs where it may aid this discussion.

Right out of the box, we're getting 8-9 million 2x150 reads (at cluster densities of ~1100K/mm^2), translating to ~2.5 Gb. List price on the reagents/flowcell is pretty close to $950 I believe (for the 300 cycle kit), but discounted prices are a fair amount lower.
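
The output figure follows from read count times read length. A minimal Python sketch, assuming "8-9 million 2x150 reads" means 8-9 million read pairs (clusters); the function name is hypothetical:

```python
def run_yield_bp(read_pairs, read_len):
    """Total bases from a paired-end run: pairs x 2 reads x read length."""
    return read_pairs * 2 * read_len


for pairs in (8e6, 9e6):
    gb = run_yield_bp(pairs, 150) / 1e9
    print(f"{pairs / 1e6:.0f} million pairs at 2x150 -> {gb:.1f} Gb")
```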

The GSJr and PGM are not even close to competitive in this space (cost, throughput, or ease of use), unless you have armies of people to keep them fed, AND you absolutely need longer than 150bp reads (GSJr).
Old 11-01-2011, 01:16 AM   #6
isildur
Junior Member
 
Location: Cyprus

Join Date: Sep 2010
Posts: 5
Default

Thank you all for your answers and helpful explanations. All very useful information. I will look into Ion Torrent as well, but I like the ease of use that the MiSeq will provide, and according to my group we will rarely require whole exome sequencing; instead we will target smaller genomic regions associated with syndromes and diseases for resequencing. I will inquire about the precise lengths of the regions of interest, but I believe a single MiSeq run will suffice.
Also, for MiSeq users: do you happen to know if the Agilent or NimbleGen exome enrichment kits are compatible with the MiSeq, or should I stick with Illumina's TruSeq?
If you happen to have in mind a helpful publication about library preparation and designing an NGS run, please share. Thanks
Old 05-21-2012, 06:29 PM   #7
ScottC
Senior Member
 
Location: Monash University, Melbourne, Australia.

Join Date: Jan 2008
Posts: 246
Default

I know this is an old thread, but for those who come across it looking for pricing estimates... be aware that Illumina's pricing for instruments and reagents varies by a large amount across the world. Often, USA and UK prices are not applicable in other countries. The pricing is not necessarily linked to the exchange rate either... Our list price for the 300-cycle kit is more than 25% higher than your quoted US prices, even though our dollar is at parity (or better).

Cheers,

Scott.
Old 04-04-2013, 08:33 PM   #8
ymc
Senior Member
 
Location: Hong Kong

Join Date: Mar 2010
Posts: 498
Default

Now MiSeq can do 2x150bp at 4.5-5.1Gb. Does that mean now it can do exome in one run?
Old 08-05-2013, 12:25 PM   #9
epistatic
Senior Member
 
Location: Dronning Maud Land

Join Date: Mar 2009
Posts: 129
Default

No, see the description above regarding exon size. Our exome libraries usually have 150-200 bp inserts and are run PE-76. Take the yield you get from the MiSeq at the shorter PE read length, not PE-150 or PE-250. These are good for small genomes or targeted studies but not yet for whole exome.

I had been told to expect MiSeq V3 to be able to do an exome, but if the rumor mill is to be believed, I wouldn't be surprised to see a 3rd instrument come out that fits the niche between MiSeq and HiSeq to compete with Proton. If you could have a MidSeq that just ran the two-lane rapid flow cell and not the HT mode, like a slimmed down HiSeq 1500 Rapid only, that would be the logical Dx instrument over MiSeq. Whole exome for germline and 200 gene Foundation Med sized panels for Cancer studies are most common.
Old 07-15-2014, 11:56 AM   #10
a.obeidat
Junior Member
 
Location: somewhere

Join Date: Mar 2013
Posts: 2
Default

Hello everyone, sorry to necro an old thread..

One thing I don't understand is why the MiSeq can't do exome sequencing. When you say exome sequencing, are you referring to all exomes in the human genome, or a single or 2 exomes in a certain gene?

Are the limitations we are talking about in terms of cost only, or in terms of technical issues? Library preparation issues? What exactly?

Let's assume I want to sequence a certain exome in gene (X). Can't I design primers flanking that exome and "resequence" it on the MiSeq? Of course I can?
Old 07-15-2014, 01:36 PM   #11
AllSeq
Registered Vendor
 
Location: San Diego, CA

Join Date: Oct 2013
Posts: 138
Default

Quote:
Originally Posted by a.obeidat View Post
Hello everyone, sorry to necro an old thread..

One thing I don't understand is why the MiSeq can't do exome sequencing. When you say exome sequencing, are you referring to all exomes in the human genome, or a single or 2 exomes in a certain gene?

Are the limitations we are talking about in terms of cost only, or in terms of technical issues? Library preparation issues? What exactly?

Let's assume I want to sequence a certain exome in gene (X). Can't I design primers flanking that exome and "resequence" it on the MiSeq? Of course I can?
You seem to be confusing two related terms - exons and exomes. Exons are the coding regions of genes while exomes are all of the exons present in a genome. Put another way, the exome is the protein-coding portion of the genome.

MiSeq doesn't have any problem sequencing exons or exomes. The issue is that a single run doesn't produce quite enough data to cover a full human exome at good depth. This is because (as stated above) exons are relatively small, and the increased output from MiSeqs has primarily come in the form of longer reads. To get good coverage on a MiSeq, you might have to run two flow cells instead of one. That's why the HiSeq and now the NextSeq are probably better choices from Illumina. If you prefer Ion Torrent, the Proton P1 would be the way to go.

If you're interested, we have summaries of the various sequencing platforms and list out which applications each is best suited for on our NGS Knowledge Bank.
__________________
AllSeq - The Sequencing Marketplace
info@AllSeq.com
www.AllSeq.com
Old 07-15-2014, 02:07 PM   #12
a.obeidat
Junior Member
 
Location: somewhere

Join Date: Mar 2013
Posts: 2
Default

AllSeq,

Thanks for the correction, I was half asleep when I wrote that post

Regarding the coverage, can't you increase the depth of sequencing and still be within the limit of the 15 Gb output?

According to post #2, even at 150x you would need 7.5 Gb of data, which is 25 million read pairs if using 2x150 (not sure what the maximum number of reads for a MiSeq flow cell is).

If the above is plausible, won't 150x coverage be enough to align your exomes and call your SNPs with confidence?

Sorry in advance if I am talking rubbish, but I am sort of new to this

Last edited by a.obeidat; 07-15-2014 at 02:22 PM.
Old 07-15-2014, 02:51 PM   #13
AllSeq
Registered Vendor
 
Location: San Diego, CA

Join Date: Oct 2013
Posts: 138
Default

It's because the 2X150 reads don't double your coverage in this case. If the insert size were 300 bp or more, it would be fine. However, human exons are only about 150 bp long, so the 2X150 read would just read the exact same molecule twice (once from either end). You could use that info to bump up the read quality a bit (by checking each read against the other), but you can't use it to increase the read depth. Most exomes are sequenced to ~100X coverage to look for rare variants (i.e., variants in a small subpopulation of the cells used to prepare the library). Reading the same exact molecule twice doesn't help you look for rare events. I hope that explanation helps a bit.
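
The overlap effect can be put into numbers with a small Python sketch. It is a simplification that treats the library insert as the unit being read and ignores adapters and trimming; both function names are hypothetical.

```python
def unique_bases_per_fragment(insert_len, read_len):
    """Distinct bases of one fragment covered by a read pair.

    Two reads cannot cover more distinct bases than the insert contains,
    so anything beyond the insert length is re-reading the same bases.
    """
    return min(insert_len, 2 * read_len)


def redundant_fraction(insert_len, read_len):
    """Fraction of sequenced bases that add no new depth."""
    sequenced = 2 * read_len
    return 1 - unique_bases_per_fragment(insert_len, read_len) / sequenced


for insert_len, read_len in ((150, 150), (300, 150), (150, 75)):
    frac = redundant_fraction(insert_len, read_len)
    print(f"insert {insert_len} bp, 2x{read_len}: {frac:.0%} redundant")
```

Under these assumptions a 150 bp insert read 2x150 gives up half of the sequenced bases, while the same insert read 2x75 gives up none.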
Old 07-18-2014, 04:22 AM   #14
Zaag
Senior Member
 
Location: Amsterdam

Join Date: Nov 2009
Posts: 112
Default

But longer reads give a better distribution of coverage so you might get to the same % of bases covered 30x with less average coverage. Of course this does not help you find mosaicisms (my definition of a rare variant would be a variant found once in a 1000 or whatever people) as AllSeq already explained.
Old 03-24-2015, 07:29 AM   #15
NGS newb
Junior Member
 
Location: Outer Heaven

Join Date: Mar 2015
Posts: 2
Red face

Thanks for this great thread. I am new to NGS, so excuse my ignorance.

I have some questions. As indicated above, one of the main problems for exome sequencing is the relatively small exon size (on average 125 bp). But if we used the MiSeq v3 150-cycle kit, that is, 2x75 paired end (if I understood that correctly), then we would not lose a lot of data because we would not be sequencing more than the insert size. Problem one checked, right?

Since we are using v3 kits, I don't think the 15 Gb will be a problem.

The second issue from the discussion above is not getting an even enough distribution of coverage across all exons. The solution on the MiSeq is to run the sample multiple times to get enough coverage of all exons to call variants with confidence. Now my question is: what extra do the NextSeq and HiSeq instruments have that gives a better distribution across the exome (assuming no output or read limitations on the MiSeq system)? I also read somewhere that on the HiSeq you can run the sample twice on the same flow cell. Is this the reason it's better, or is it something else? I'm not really sure about the NextSeq flow cell configuration, so your input here will be helpful.

A possible counter to the above issue (if I understood it correctly): in a technote by Illumina, "Optimizing Coverage for Targeted Resequencing", they explain coverage and enrichment and give an example (page 4; you might have to see it to understand my logic below):

Let's assume I want 100x mean coverage, that is, 20x desired coverage / 0.2 mean normalized coverage: 20/0.2 = 100.

Now for the total amount of sequencing required: 62 Mb of total targeted bases x 100x mean coverage / 0.65 enrichment efficiency, which equals around 9.5 Gb (less than the 15 Gb MiSeq maximum output): (62 x 100)/0.65

Does that make sense, or am I just talking rubbish?

If the above is correct, doesn't this save you from running the exome more than once?

Thanks in advance
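
The calculation in this post can be written as a short Python sketch. The function and parameter names are made up here, and the inputs (62 Mb target, 20x desired coverage, 0.2 mean normalized coverage, 0.65 enrichment efficiency) are simply the figures used in the post.

```python
def required_output_bp(target_bp, desired_depth, normalized_cov, enrichment):
    """Total bases to sequence so that the worst-covered regions of interest
    still reach the desired depth.

    normalized_cov -- coverage of those regions relative to the mean
    enrichment     -- fraction of sequenced bases that land on target
    """
    mean_depth = desired_depth / normalized_cov
    return target_bp * mean_depth / enrichment


total = required_output_bp(62e6, 20, 0.2, 0.65)
print(f"required output: {total / 1e9:.1f} Gb")
```

This reproduces the roughly 9.5 Gb figure. Whether a single run actually delivers that much depends on the read length chosen, since run output scales with it.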
Old 03-24-2015, 09:37 AM   #16
AllSeq
Registered Vendor
 
Location: San Diego, CA

Join Date: Oct 2013
Posts: 138
Default

Quote:
Originally Posted by NGS newb View Post
But if we used the MiSeq v3 150-cycle kit, that is, 2x75 paired end (if I understood that correctly), then we would not lose a lot of data because we would not be sequencing more than the insert size. Problem one checked, right?

Since we are using v3 kits, I don't think the 15 Gb will be a problem.
If you cut the read length down to 2x75, you won't get the full 15Gb. The 15Gb calculation comes from 25M reads * 2X300bp reads. 2X75bp reads will yield only ~3.8Gb. And that's not really enough to fully cover an exome (especially if a portion of those reads are off-target).
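
To make the scaling explicit, here is a small Python sketch that holds the cluster count fixed at the 25M used in the calculation above and varies only the read length. The 9.5 Gb requirement is the estimate from the quoted poster's earlier calculation, and the names are hypothetical.

```python
READ_PAIRS = 25e6      # cluster count from the 15 Gb = 25M x 2x300 arithmetic
REQUIRED_GB = 9.5      # estimate from the earlier post, not a measured need


def yield_gb(read_len, read_pairs=READ_PAIRS):
    """Run output in Gb for a paired-end run of the given read length."""
    return read_pairs * 2 * read_len / 1e9


for read_len in (75, 150, 300):
    out = yield_gb(read_len)
    verdict = "meets" if out >= REQUIRED_GB else "falls short of"
    print(f"2x{read_len}: {out:.2f} Gb, {verdict} {REQUIRED_GB} Gb")
```

Only the 2x300 configuration reaches the estimate on paper, and at that length most of the extra bases would overlap on typical exome inserts.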

Tags
exome sequencing, miseq, newbie
