Hello everyone, I am planning a large experiment: 300 human stool samples over the next month, to identify the bacterial species present in each sample (using one MiSeq machine). Based on time and cost, should I go with the MiSeq or the 454 sequencer, especially if I have to run another 300 samples per month for the next few months? Thanks so much for your insights. I'll update this post as I receive more feedback and information.
-
Well, this is a pretty massive question, but in our lab at UC Davis we have both a 454 and a MiSeq, and we don't use the 454 for projects like this anymore. Everything from sample prep to analysis is easier on the MiSeq. There is no denoising, and with careful design you can avoid chimeric amplicon issues too. We have never tried the cloud analysis with the MiSeq. We use custom primers, so I'm not sure whether that would work...
-
Hi vs92,
We've been playing around with both the 16S V4 protocol by Caporaso et al. as well as an expanded V4/5 version using a pseudo 2x250 setup for a few months now. Based on our current experience with the MiSeq compared to doing 454, I'd say the MiSeq is the better choice.
Both 454 and the MiSeq would require a lot of custom indexed primers, PCR, and cleanup, but the MiSeq has the advantage of not requiring an emPCR step and of a much easier setup (although to be fair I haven't actually done a 454 run in over 2 years, so it may have gotten better). The MiSeq also has virtually no homopolymer issue, which means you can proceed with your data analysis much faster without having to do a computationally expensive denoising step. Max throughput on 454 is ~600K reads, while even with a 50% spike of phiX, which is necessary for amplicon sequencing on the MiSeq, you should still get >2 million reads/run. Given that a 300-cycle kit from Illumina costs ~$1000, you have a drastically reduced cost/sample compared to the 454.
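To put the figures above next to the 300 samples from the original question, here is a rough back-of-the-envelope sketch. It assumes the ">2 million reads/run" figure is what remains for amplicons after the phiX spike, that all 300 samples are pooled evenly on a single run, and that only the ~$1000 kit is counted (no primers, PCR, or cleanup). The function name `per_sample` is made up for illustration, and no 454 run cost is given in the thread, so none is computed.

```python
def per_sample(reads_per_run, samples, kit_cost=None):
    """Split one run evenly across samples.

    Returns (reads per sample, kit cost per sample or None).
    Even pooling is an assumption; real pools are never perfectly balanced.
    """
    reads = reads_per_run // samples
    cost = kit_cost / samples if kit_cost is not None else None
    return reads, cost


# Figures quoted in the post: >2 million reads/run and a ~$1000 kit for the MiSeq,
# ~600K reads/run for the 454 (no 454 run cost is stated, so it is left out).
miseq_reads, miseq_cost = per_sample(2_000_000, 300, kit_cost=1000)
fls_reads, _ = per_sample(600_000, 300)

print(f"MiSeq: ~{miseq_reads} reads/sample, ~${miseq_cost:.2f} kit cost/sample")
print(f"454:   ~{fls_reads} reads/sample")
```

Whether a few thousand reads per sample is deep enough depends on the community and the question, so in practice the 300 samples might be split over several runs.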
Now, one active topic of debate is how well the short reads from the MiSeq are able to capture your community compared to 454. My feeling is that it's a bit of a moot point, since neither is 100% accurate and each has its own error sources. Given the higher throughput and drastically reduced cost/sample, I expect a lot of people to give up on 454 and switch to Illumina. With the imminent release of 500-cycle kits capable of doing 2x250 bp reads, combined with read pair merging, you'll soon be getting high quality ~400 bp 16S amplicons that will completely supplant 454.
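As a quick check on the 2x250 arithmetic: if both reads span their full length and the amplicon is shorter than the two reads combined, the expected overlap is simply the difference. A minimal sketch (the function name is made up, and quality trimming, which shortens the usable overlap, is ignored):

```python
def expected_overlap(read_len, amplicon_len):
    """Bases covered by both reads of a pair, or 0 if the reads do not meet."""
    return max(0, 2 * read_len - amplicon_len)


# 2x250 bp reads over a ~400 bp amplicon leave ~100 bp of overlap for merging.
print(expected_overlap(250, 400))  # 100
# With a 300-cycle kit (2x150), the same amplicon would not overlap at all.
print(expected_overlap(150, 400))  # 0
```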
-
Originally posted by capsicum:
Can anyone share some run statistics for 16S runs on the MiSeq? What cluster density are you aiming for? What PhiX spike-in proportion are you using? What cluster density, sequence yield and sequence quality are you getting back for these runs?
Post-upgrade, it's a whole different ball game. We've been able to get good data when using 90% phiX, but as you can imagine the yield is terrible. Cost is still around $100/sample for ~75K reads based on our results, which is better than 454. There have been a number of "hacks" using hard-coded run parameters that keep the software issue from destroying run quality, but they're not supported by Illumina and our only attempt to try it ended with an instrument failure so we're currently waiting to try again.
One way to get around the current software issue is to sequence a metagenome/transcriptome along with your amplicons so you're not wasting reads on phiX. That adds costs in having to prepare those libraries, and the data generally isn't as useful compared to a HiSeq run because of the shallow coverage, but considering phiX gives you nothing it's a worthwhile step in my opinion.
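To make the yield penalty concrete, here is a small sketch of how the phiX spike eats into amplicon reads. The total of 1,000,000 passing-filter reads is a made-up round number, not a figure from this thread; only the spike percentages (90%, 50%, 40%, 30%) come from the discussion. Integer percentages are used to keep the arithmetic exact.

```python
def amplicon_reads(total_reads, phix_percent):
    """Reads left for amplicons after a phiX spike given as a whole percentage."""
    if not 0 <= phix_percent <= 100:
        raise ValueError("phix_percent must be between 0 and 100")
    return total_reads * (100 - phix_percent) // 100


# Hypothetical run total, for illustration only.
total = 1_000_000
for spike in (90, 50, 40, 30):
    print(f"{spike}% phiX -> {amplicon_reads(total, spike)} amplicon reads")
```

Under this assumption a 30% spike leaves seven times as many amplicon reads as a 90% spike for the same run, which is why swapping the phiX for a library you actually want can be attractive.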
-
So things have gotten worse since the hardware upgrade? What has caused this... hardware or software? It sounds like just a software issue (well, perhaps a methodological issue, with that methodology implemented in the software). But why has it gotten worse with the upgrade?
Is the issue solely caused by the poor colour matrix and phasing estimates (assuming cluster identification and image registration are OK)?
Lastly, do you mean that you now have to use 90% instead of the 20-60% that I've seen mentioned before?
-
So far Illumina has been very good at working with my group at figuring out how to work around this RTA issue, but it's not easy and there are a lot of big labs that this is really causing issues for.
One thing I have heard is that this issue does not affect the HiSeq at all. You don't get the 2x250 read lengths, and it's a much higher investment, but according to the people I've talked with who've tried it, it does work with only 40% phiX.
Last edited by mcnelson.phd; 10-22-2012, 05:23 PM.
-
Originally posted by mcnelson.phd:
The quality really shouldn't be that bad, but you can't trust the data at all when that happens.
With the 2 x 250 bp reads we get a significant overlap in the middle of the reads (without any errors in the majority of reads). So if we set the scores aside, there appears to be no problem with the sequence itself. At least in the case we are looking at (16S multiplexed, no phiX because of custom primer, hardcoded matrix/phasing).
Last edited by GenoMax; 10-23-2012, 04:57 AM.
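For anyone who wants to check overlap agreement on their own pairs independently of the quality scores, here is a minimal sketch of the idea. It is not the method used above and not a replacement for a real read merger: it ignores quality values entirely, tries the longest candidate overlap first, and accepts the first one within the mismatch limit. All function names and the toy amplicon are made up for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_overlap(read1, read2, min_overlap=10, max_mismatch=0):
    """Longest suffix of read1 that matches a prefix of revcomp(read2).

    Returns (overlap_length, mismatches), or None if nothing qualifies.
    """
    r2 = revcomp(read2)
    for olen in range(min(len(read1), len(r2)), min_overlap - 1, -1):
        mismatches = sum(a != b for a, b in zip(read1[-olen:], r2[:olen]))
        if mismatches <= max_mismatch:
            return olen, mismatches
    return None


def merge_pair(read1, read2, min_overlap=10, max_mismatch=0):
    """Merge a read pair into one sequence, or return None if no overlap is found."""
    hit = find_overlap(read1, read2, min_overlap, max_mismatch)
    if hit is None:
        return None
    olen, _ = hit
    return read1 + revcomp(read2)[olen:]


# Toy example: a 40 bp "amplicon" read from both ends with 25 bp reads.
amplicon = "ACGTTGCAAGCTTAGGCCATCGATTACGGATCCTGAAGTC"
read1 = amplicon[:25]
read2 = revcomp(amplicon[-25:])
print(find_overlap(read1, read2))            # (10, 0)
print(merge_pair(read1, read2) == amplicon)  # True
```

Counting how many pairs merge and tallying the mismatches inside the overlap gives the kind of metric asked about in this thread (percent overlapping, amount of overlap, error rate in the overlap).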
-
Originally posted by GenoMax:
With the 2 x 250 bp reads we get a significant overlap in the middle of the reads (without any errors in the majority of reads). So if we set the scores aside, there appears to be no problem with the sequence itself. At least in the case we are looking at (16S multiplexed, no phiX because of custom primer, hardcoded matrix/phasing).
Also, how are you getting away with no phiX at all? Are you doing multiple different V-regions of the 16S so that the cluster recognition isn't affected? That's something that we are considering, but it still seems risky to not use any phiX (I'd at least use a 1% spike as a sequencing control like Illumina recommends).
-
Originally posted by mcnelson.phd:
Can you give some metrics on what percent are overlapping and how much overlap you're using? I never looked into seeing if the sequences were good but the quality was wrong because our FAS said that with the RTA error they can't make any guarantees about basecalling being accurate. I have looked at the phiX from poor runs, and do see a lot more base errors than one should normally see.
Are the "poor" runs you are referring to judged by the number of reads passing filter, or by quality scores?
Originally posted by mcnelson.phd:
Also, how are you getting away with no phiX at all? Are you doing multiple different V-regions of the 16S so that the cluster recognition isn't affected? That's something that we are considering, but it still seems risky to not use any phiX (I'd at least use a 1% spike as a sequencing control like Illumina recommends).
-
Our lab is getting ready to do a MiSeq run (16S, 2x250) and I have some questions as well. I plan on using barcoded primers (12 bp Golay) but letting the sequencing center index our reads (A and B tags). Is this doable? I would imagine that in post-processing I should be able to overlap the reads, strip the barcodes, and send the result through QIIME, without having to order primers like those of Caporaso et al., which are very large (Illumina adaptor/index/spacer/barcode/primer) and look to be very, very expensive. Ours would be just barcode/spacer/primer, with our center prepping the libraries to add adaptors and indices as necessary.
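The "strip the barcodes" step described above could look roughly like this. It is only a sketch under several assumptions: the barcode sits at the very start of each read, matching is exact (no Golay error correction is attempted), and the spacer length is known. The barcodes and sample names below are placeholders, not real Golay codewords, and in a real pipeline QIIME's own demultiplexing would normally do this job.

```python
def demultiplex(reads, barcode_to_sample, barcode_len=12, spacer_len=0):
    """Assign reads to samples by an exact-match leading barcode.

    Returns (dict of sample -> list of reads with barcode and spacer removed,
             list of reads whose barcode was not recognised).
    """
    by_sample = {}
    unassigned = []
    for read in reads:
        barcode = read[:barcode_len]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned.append(read)
        else:
            by_sample.setdefault(sample, []).append(read[barcode_len + spacer_len:])
    return by_sample, unassigned


# Placeholder 12 bp barcodes and sample names, for illustration only.
barcodes = {"AACCGGTTAACC": "stool_001", "TTGGCCAATTGG": "stool_002"}
reads = [
    "AACCGGTTAACC" + "GT" + "ACGTACGTAC",
    "TTGGCCAATTGG" + "GT" + "TTTTCCCCGG",
    "GGGGGGGGGGGG" + "GT" + "ACGTACGTAC",  # unknown barcode
]
assigned, leftover = demultiplex(reads, barcodes, spacer_len=2)
print(assigned)
print(len(leftover), "read(s) unassigned")
```

Primer trimming is left out here; it would follow the same pattern with one more fixed-length or matched prefix to remove.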
-
GenoMax:
There will always be reads that are genuinely of poor quality that need to be trimmed or discarded, even in a 'good' run. If it's true that the sequence is actually OK but the quality scores are just incorrect/miscalculated, and you then discard this information, what do you do about downstream processing? If you're using this data for 16S tag sequencing, how do you pre-process the reads?
PS: Perhaps you already know about this, and/or perhaps your system also precludes the use of the Illumina sequencing primer. If not, then you can usually use PhiX, even in a custom-primed run, by simply adding the custom primer to the existing primer tube on the MiSeq cartridge, rather than one of the custom tubes. Then you're doing a sequencing reaction using several different primers at once and only the relevant primers will bind to the relevant clusters (the MiSeq cartridge already contains lots of different primers, anyway). But, maybe you don't need it.
-
Originally posted by mcnelson.phd:
Post-upgrade, it's a whole different ball game. We've been able to get good data when using 90% phiX, but as you can imagine the yield is terrible. Cost is still around $100/sample for ~75K reads based on our results, which is better than 454. There have been a number of "hacks" using hard-coded run parameters that keep the software issue from destroying run quality, but they're not supported by Illumina and our only attempt to try it ended with an instrument failure so we're currently waiting to try again.
You must be overloading. We run 5pM and 30% PhiX.