SEQanswers

09-18-2012, 01:47 PM   #1
koadman (Member; Location: Sydney, Australia; Join Date: May 2010; Posts: 64)
Nextera XT multiplex sample normalization

Hi all,
Our lab just did our first Nextera XT run on a MiSeq. I'm just the informatics guy and didn't actually prep the libraries myself, but the person who did says the protocol is nicely straightforward compared to some earlier library protocols.

Unfortunately, the resulting run had a fairly wide range of sample abundances, and I'm wondering whether any of you have insight into why, or what we might do differently to achieve uniform abundances. I've pasted a histogram of per-sample read counts below. Input material was normalized to 1 ng/sample as measured by fluorimetry (PicoGreen or Qubit).
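For anyone wanting to reproduce the tally, the histogram can be drawn from a per-sample read-count table in a few lines of R. This is only a minimal sketch; the file name sample_read_counts.txt and its two-column layout (sample name, read count) are assumptions, not something taken from the actual run output.

Code:
# Minimal sketch: assumed input is a two-column table of sample name and read
# count, e.g. exported from the MiSeq demultiplexing summary
counts <- read.table("sample_read_counts.txt", header = FALSE,
                     col.names = c("sample", "reads"))

hist(counts$reads, breaks = 20,
     xlab = "Reads per sample", main = "Per-sample read counts")

# fold difference between the most and least abundant samples
max(counts$reads) / min(counts$reads)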


There is a 10-fold difference between the least and most abundant samples. My (very limited) understanding of the protocol is that normalization happens at the last stage of library prep, where the tagmented samples are mixed with magnetic beads and shaken at 1800 rpm. The sample material is supposed to saturate the limited binding capacity of the beads in each tube, so that when the material is released from the beads, each sample yields about the same amount.

Does that sound right? Any ideas on what we could do differently to achieve better normalization? Are the results I have here considered "good" normalization?

09-18-2012, 11:37 PM   #2
MrGuy (Member; Location: earth; Join Date: Mar 2009; Posts: 68)

Surface-capture normalization is based on overloading. For it to work, the beads capture X molecules. If you add more than X, you get normalized samples, because the excess molecules are discarded in the washes. If you add less than X, you get poorly normalized samples, because the beads capture "all" of the available molecules.

My guess is that you are below X, since you normalized to 1 ng of input. You are also gambling on the lot-to-lot reproducibility of the kit at such a low input. What input range does the manufacturer recommend?
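To make the arithmetic concrete, here is a toy R sketch of that saturation argument. The numbers are entirely made up; the point is only to illustrate why samples below the bead capacity escape normalization.

Code:
# Toy model of bead-based normalization (illustrative numbers only)
capacity <- 25                      # hypothetical bead binding capacity, arbitrary units
input    <- c(5, 15, 30, 60, 120)   # hypothetical library mass entering normalization

recovered <- pmin(input, capacity)  # each sample is capped at the bead capacity
recovered
# Samples above the capacity all come back at ~25 (normalized);
# samples below it come back with whatever they brought in (not normalized).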

09-19-2012, 05:53 AM   #3
koadman (Member; Location: Sydney, Australia; Join Date: May 2010; Posts: 64)

Thanks for the reply, MrGuy; it hadn't occurred to me that we might be underloading some samples. The Nextera XT protocol had us tagment 1 ng of material per sample, then amplify that material with 12 cycles of PCR before carrying out the normalization, so in principle there should have been much more than 1 ng per sample going into normalization. However, the person making the libraries reserved a portion of the PCR product, so perhaps it's as simple as reserving less next time.

09-24-2012, 04:51 PM   #4
ScottC (Senior Member; Location: Monash University, Melbourne, Australia; Join Date: Jan 2008; Posts: 246)

I'm not saying that this is the solution to your problem, but a couple of things come to mind as possibilities:

I'd measure the amount of material in the PCR tube before and after cycling to make sure the PCR reaction is actually producing something and that the yield is good enough. I'd also measure the size distribution on a Bioanalyzer after tagmentation and/or PCR. Sometimes the size range differs a lot between samples (e.g. some libraries with fragments down near 100 bp, others up at 2 kb), and this might affect the normalisation too; but I haven't actually measured the effect of fragment size on normalisation...

Cheers,

Scott.

09-25-2012, 03:42 PM   #5
koadman (Member; Location: Sydney, Australia; Join Date: May 2010; Posts: 64)

Thanks for the reply, Scott. Based on your suggestion, I looked at the relationship between the median insert size of mapped reads and the number of clusters per sample. See the attached plot.

The plot suggests there might be a relationship, and a Pearson correlation test seems to confirm it:
Quote:
> cor.test(abc$V1,abc$V2)

Pearson's product-moment correlation

data: abc$V1 and abc$V2
t = -2.3818, df = 13, p-value = 0.0332
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.82929338 -0.05423031
sample estimates:
cor
-0.5511812
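(For reference, the table and test above can be reproduced roughly as follows. This is only a sketch; the file name insert_vs_clusters.txt and its two unnamed columns, median insert size then read count per sample, are placeholders.)

Code:
# Minimal sketch: build the two-column table used above
abc <- read.table("insert_vs_clusters.txt", header = FALSE)

plot(abc$V1, abc$V2,
     xlab = "Median insert size (bp)", ylab = "Reads per sample")

cor.test(abc$V1, abc$V2)   # Pearson product-moment correlation, as quoted above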
We want to process hundreds of samples very quickly, so unfortunately running a Bioanalyzer on every sample is not really a possibility for us. We're thinking instead of reducing our estimated input to 0.5 ng per sample in the hope that this will give tighter insert-size distributions. We're a bit concerned about the effect this might have on library complexity, since there must be a trade-off between input quantity and complexity. One further question: when you use lower input amounts, do you add extra PCR cycles to compensate?

09-25-2012, 06:43 PM   #6
ScottC (Senior Member; Location: Monash University, Melbourne, Australia; Join Date: Jan 2008; Posts: 246)

Are you processing the same sample type, or is each one different? We're a service-provision lab, so each sample we process can be from a different organism, prepared in a different way by a different set of hands with different reagents... we're running a Bioanalyzer trace for each of them for the time being because we see such huge differences between the libraries.

If you're processing similar samples, I think you should be able to optimise things a bit and come up with a standard 'apparent' DNA concentration (using your method of quantitation) that will work well each time. When we first started using the XT preps and noticed such wide variation in sizing and library yield, we did a set of 5 preps ranging from 0.2 ng to 1.0 ng to find an optimal mass... but, again, that only works if you're processing the same kind of sample all the time (unfortunately, we're not).

We ended up settling on 0.8 ng for most preps. We didn't increase the PCR cycle number, because the yield was good enough and we wanted to avoid too many PCR duplicates.

09-25-2012, 07:28 PM   #7
koadman (Member; Location: Sydney, Australia; Join Date: May 2010; Posts: 64)

That's a good point; we've noticed similar issues in the past too. The samples in the histogram I posted above are indeed from at least 4 different sample types, at least 4 different DNA extraction methods, and 2 quantitation methods (PicoGreen on plates read by a TECAN, and Qubit). The samples in the x-y scatterplot (my 3rd post) are a subset of those in the histogram that are all from the same sample type, DNA extraction method, and quantitation method. The hundreds of samples we're about to process will all be the same sample type, extracted and quantitated the same way, so yes, we'll aim for an operational standard optimized for that sample type. Still, it's annoying that extraction method, sample type, and quantitation method seem to have such a strong influence on the quality of transposase-catalyzed preps.

Do you think adding extra PCR cycles would increase the PCR duplicate rate in the resulting sequence data? I was under the impression that reducing the starting material would have a stronger effect than adding a few extra cycles. GC-content bias from PCR does seem like it could be an issue with extra cycles, though.

10-02-2012, 04:32 PM   #8
koadman (Member; Location: Sydney, Australia; Join Date: May 2010; Posts: 64)

Hi all, following up to give this thread some closure. We remade libraries for the 16 samples from the x-y plot above (all the same sample type, DNA extraction, and quantitation) using 0.5 ng input and 14 cycles of enrichment PCR. The cluster count was low, so it seems our loading concentration was too low. However, the normalization looks much better now:



And a Pearson test shows no correlation between insert size and read count. Thanks for the suggestions! Now we just need to work out how to load these on the HiSeq!