SEQanswers

Old 06-25-2016, 11:28 AM   #21
thermophile
Senior Member
 
Location: CT

Join Date: Apr 2015
Posts: 233
Default

I haven't worked with community data out of JGI since the 454 days, so I was unaware they'd adopted staggered primers. To get a *minimum* of 10k seqs/sample (not an average), I multiplex 150ish samples (plus either genomes, another amplicon, or PhiX) and aim to cluster around 750. Most samples get way more than 10k seqs, and very few end up with fewer.

I don't see how rarefying/subsampling to 10k would change statistical power for community analysis, because the unit being compared is the sample. Statistical power comes from the number of samples rather than the number of sequences. This is certainly true for any analysis that is distance-matrix based. Even if you are doing some type of modeling, the power limitation is still the number of samples rather than the number of observations within each OTU.

What kind of community analyses draw power from the number of seqs?
Old 06-25-2016, 04:00 PM   #22
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

If you are interested in how a specific species' fraction of a community changes over time, or what that fraction is in a single sample, the quantity of data will improve your resolution. If you subsample down to 1 read, obviously, the results will be garbage. At 10 reads, they will be slightly better, but still garbage. At 1 billion reads, you will have extremely fine-grained resolution that allows very precise calculations. If you take your hypothetical 1 billion reads and subsample down to 10k... you lose all of that precision, and of course you also lose the ability to separate error clusters from real clusters in low-abundance species. What do you gain? Nothing, as far as I can tell, except that things will run faster. So if you are compute-limited, this kind of makes sense, even though the results of the research will be ... "limited". But since compute time is so much cheaper than sequencing, I really cannot imagine a scenario where it's a good idea.

Oh... to answer your last question, ALL analyses draw power from the number of sequences. That's kind of how statistics works, not to mention that bioinformatics in general is reliant on high redundancy to compensate for errors in sequencing.
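To put rough numbers on the resolution argument above, here is a minimal sketch that treats each read as an independent draw, so the estimated fraction of a species follows a binomial distribution. The 1% species fraction is a hypothetical value chosen for illustration, and the model ignores PCR bias and sequencing error, so treat it as a lower bound on the real uncertainty.

```python
import math


def proportion_standard_error(fraction, n_reads):
    """Binomial standard error of a species fraction estimated from n_reads reads."""
    return math.sqrt(fraction * (1.0 - fraction) / n_reads)


def relative_error(fraction, n_reads):
    """Standard error expressed as a share of the fraction itself."""
    return proportion_standard_error(fraction, n_reads) / fraction


if __name__ == "__main__":
    true_fraction = 0.01  # hypothetical: a species making up 1% of the community
    for depth in (10, 10_000, 1_000_000_000):
        rel = relative_error(true_fraction, depth)
        print(f"{depth:>13,} reads: relative standard error ~ {rel:.4g}")
```

Under this simple model the error shrinks with the square root of the read count, which is why subsampling a deep library discards precision for any per-species abundance estimate.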
Old 06-26-2016, 03:50 PM   #23
nucacidhunter
Senior Member
 
Location: Melbourne

Join Date: Jan 2013
Posts: 1,140
Default

Quote:
Originally Posted by Brian Bushnell
I was unaware that they might increase bias; could you explain that?

Any primer with a non-template sequence overhang at the 5′ end, such as fusion primers, primers with 5′ variable-length spacers or barcodes, and primers with a common overhang, can bias amplification of some templates. This happens because the overhang is partially or completely complementary to some templates, which increases its binding strength and their amplification, and also because of possible interactions between the primer pairs used in the PCR.
Community profiling with these primers usually involves using a different primer pair for each sample, where the template-specific sequences are shared while the overhang is not. So it is essential to test that all primer pairs produce similar results when used with one input DNA (preferably a control community with known composition). This becomes even more crucial for time series, where community structure is followed over time and different primer pairs might be used for a sample at different time points.
Old 06-27-2016, 08:30 AM   #24
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

That makes sense. Thanks!
Old 06-27-2016, 09:18 AM   #25
thermophile
Senior Member
 
Location: CT

Join Date: Apr 2015
Posts: 233
Default

Rarefaction/subsampling isn't done to speed up computer processing but rather to compare communities at the same level of sampling effort. With these data there is no biological reason that one sample has 500k sequences and another has 10k; it mostly comes down to how evenly the libraries are pooled, and a little to PCR efficiency. Since we rarely sample the community to completion, the sample with more seqs will appear more diverse just based on sequencing effort.

Your example of the statistical analysis is typical change detection, which is not the basis of most community analyses.
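The sampling-effort point above can be sketched in a few lines. This is only an illustration, not any particular pipeline's implementation: the community (300 OTUs with a long tail of rare members) is made up, the helper names are hypothetical, and the 500k and 10k depths are simply the figures mentioned in the post.

```python
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Draw `depth` reads without replacement from a sample's per-OTU read counts."""
    total = sum(counts.values())
    if depth > total:
        raise ValueError(f"cannot rarefy {total} reads to depth {depth}")
    # Expand counts into one entry per read, then sample without replacement.
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    return Counter(rng.sample(reads, depth))


def observed_richness(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for n in counts.values() if n > 0)


def sequence_community(abundances, n_reads, rng):
    """Simulate sequencing: draw n_reads OTU labels in proportion to abundance."""
    otus = list(abundances)
    weights = [abundances[o] for o in otus]
    return Counter(rng.choices(otus, weights=weights, k=n_reads))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical community: 300 OTUs with a long tail of rare members.
    community = {f"OTU{i}": 1.0 / (i + 1) ** 1.5 for i in range(300)}

    # Two libraries from the SAME community, differing only in depth.
    deep = sequence_community(community, 500_000, rng)
    shallow = sequence_community(community, 10_000, rng)
    deep_rarefied = rarefy(deep, 10_000, rng)

    print("richness, deep library (raw):      ", observed_richness(deep))
    print("richness, shallow library (raw):   ", observed_richness(shallow))
    print("richness, deep library (rarefied): ", observed_richness(deep_rarefied))
```

Both simulated libraries come from the same community, so any richness difference between the raw counts reflects depth alone; comparing the shallow library against the rarefied deep one puts them on equal sampling effort.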
Old 07-08-2016, 10:54 AM   #26
urchin
Member
 
Location: California

Join Date: Sep 2015
Posts: 20
Ummm, so what should I actually do????

Thank you all for your contributions to this discussion. I was away and just got to read the follow-up comments, some of which I can't fully follow since I have yet to complete a library and get data back. Luckily we will be getting help with the analysis.

Should I consider using 2 separate primer sets so I can use PE250, one for V3 and another for V4? If so, could I split up the amplicons evenly and get good data from one library? If I can afford to do 2 separate libraries, one for V3 and one for V4, should I just do that instead? How much more information, or perhaps I mean power for meaningful analysis, would this provide? We may have money for that, but I won't know for sure till next month. In a perfect world I'm sure this is the best answer to the problem, but we know the world isn't perfect.

It seems I should definitely avoid using PE300 since the chemistry is not working consistently yet! I'm just trying to figure out how to get the most out of what we can currently do! I'm going with the staggered primers and I need to order them very soon.
Old 07-10-2016, 05:42 PM   #27
nucacidhunter
Senior Member
 
Location: Melbourne

Join Date: Jan 2013
Posts: 1,140
Default

Quote:
Originally Posted by urchin
Should I consider using 2 separate primer sets so I can use PE250, one for V3 and another for V4? If so, could I split up the amplicons evenly and get good data from one library? If I can afford to do 2 separate libraries, one for V3 and one for V4 should I just do that instead? How much more information - or perhaps I mean power for meaningful analysis- would this provide? We may have money for that, but I won't know for sure till next month. In a perfect world I'm sure this is the best answer to the problem, but we know the world isn't perfect.

It seems I should definitely avoid using PE300 since the chemistry is not working consistently yet! I'm just trying to figure out how to get the most out of what we can currently do! I'm going with the staggered primers and I need to order them very soon.

The choice of variable region depends on the study and is best based on the current literature (even though it has some issues). You will not get identical results (OTUs, taxonomic resolution) using different regions. But using amplicons covering two variable regions (V3-V4) will give higher resolution and more accurate results than using either of those regions alone. If you are going to use the V3-V4 regions then you have to use 300PE. This chemistry has shown improved output recently, and there is the option of doing more sequencing if the results are poor.
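As a rough check on the read-length requirement above, here is a small sketch of the nominal overlap between paired reads. The 460 bp amplicon length is an assumption (a commonly quoted approximate size for V3-V4 amplicons, not a figure from this thread), so substitute the real length for your primers. Quality trimming of read tails, which the sketch ignores, reduces the usable overlap further.

```python
def nominal_overlap(read_length, amplicon_length):
    """Bases covered by both reads of a pair if neither read is trimmed.

    A negative value means the reads do not meet and leave a gap.
    """
    return 2 * read_length - amplicon_length


if __name__ == "__main__":
    assumed_amplicon = 460  # ASSUMPTION: approximate V3-V4 amplicon length in bp
    for read_length in (250, 300):
        overlap = nominal_overlap(read_length, assumed_amplicon)
        print(f"PE{read_length}: nominal overlap {overlap} bp before trimming")
```

The smaller the nominal overlap, the less room there is to trim low-quality read ends and still merge the pair.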