SEQanswers > SOLiD

tdm 03-30-2011 11:37 AM

PCR duplicates increase when excess of beads
We are seeing an increase in PCR duplicates.
Could this be related to increasing the quantity of beads in order to get more reads?

pmiguel 03-30-2011 11:44 AM

Hi Thibault,
Could you give us a little more information to go on? What type of library? How many cycles of PCR did you do during library construction? (I don't mean ePCR, I mean the number of PCR cycles used during the various library construction steps.)
Finally, how many beads/panel are you depositing?

tdm 03-30-2011 12:55 PM

Hi PMiguel,
Here is the data you asked for:
Type of library: SureSelect All Human Exon Capture (50Mb)
Sequencing: Fragment (50bp), barcoded samples, 6 samples/slide
How many cycles of PCR: As low as possible. Usually 10 cycles each in pre- and post-capture PCR. But in this case, we had to do 12 cycles for both pre- and post-capture PCR.
How many beads/panel: ~280,000 beads/panel

Thank you

pmiguel 03-30-2011 01:41 PM

Hi Thibault,
The only circumstances under which going to higher densities of beads would result in high % of PCR duplications would be:
(1) There are some high intensity beads bleeding signal across their neighbors--hence duplicate sequences. You could actually check this if your informatic skills were mighty enough. The read names are coordinates--so if your PCR duplicates frequently are seen spatially adjacent, then you would infer that you have a signal bleed over issue.
(2) Your library is bottomed out. That is you started with 0.1 pg of original template with an average size (say) of 100 bp--well that is 1 million DNA fragments. If you do a sequencing run and produce 100 million reads from the library that you made from the initial 1 million DNA fragments you are guaranteed to have an average of 100 PCR duplicates of each. The above example is extreme. Unlikely you would bottom-out a library that much. But that is the idea. If, at any point in your protocol, you bottleneck the complexity of your library below the number of reads you ultimately produce, then you will end up with PCR duplicates.
(3) Amplicon contamination. If any of your ePCR amplicons end up in your library construction area, they will contaminate subsequent libraries you construct. This isn't directly related to numbers of beads. But if you have a low level of amplicon contamination in a library you might not really notice it until you went deeper in the library.
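
To put rough numbers on point (2), here is a minimal sketch of the expected duplicate fraction for a library of a given complexity. It assumes every distinct fragment is equally likely to be sequenced (sampling with replacement, Poisson approximation), which real libraries only approximate. The function name and parameters are made up for illustration.

```python
import math

def expected_duplicate_fraction(library_molecules, reads):
    """Expected fraction of reads that are duplicates.

    Assumes uniform sampling with replacement from `library_molecules`
    distinct fragments (an idealisation; real libraries are skewed).
    """
    depth = reads / library_molecules
    # Expected number of distinct fragments seen at least once.
    distinct = library_molecules * -math.expm1(-depth)
    return 1.0 - distinct / reads

# The extreme example above: 1 million fragments, 100 million reads.
print(expected_duplicate_fraction(1e6, 1e8))
# A library far more complex than the read count.
print(expected_duplicate_fraction(1e9, 1e6))
```

For the extreme example in (2), nearly all reads come out as duplicates, while a library a thousand times more complex than the read count gives a duplicate fraction well under 1%.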


tdm 03-30-2011 01:56 PM

I was already thinking about point number 2.
I will check the distance between duplicates of the same fragment.
Do you know if the Picard argument OPTICAL_DUPLICATE_PIXEL_DISTANCE works with SOLiD data?

Thank you Phillip

pmiguel 03-31-2011 05:14 AM

I do not know. But the read name itself is the bead's co-ordinate. The first field is the panel number. So the 2nd and 3rd are probably planar co-ordinates (Cartesian?). Does anyone know? I vaguely remember reading this in a SOLiD manual at some point. But I am not seeing it now.
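
A rough sketch of the adjacency check, assuming the read names really are of the form panel_x_y (optionally followed by a tag such as F3) and that the duplicate groups have already been built from the alignments. The name format, the helper names, and the distance threshold are all assumptions for illustration, not something confirmed from a SOLiD manual.

```python
from itertools import combinations
from math import hypot

def parse_bead_id(read_name):
    """Split an assumed 'panel_x_y[_tag]' read name into three integers."""
    panel, x, y = read_name.lstrip(">").split("_")[:3]
    return int(panel), int(x), int(y)

def adjacent_pair_fraction(duplicate_groups, max_dist=100):
    """Fraction of duplicate pairs on the same panel within max_dist.

    `duplicate_groups` is an iterable of lists of read names, one list
    per set of reads flagged as duplicates of each other. `max_dist` is
    in whatever units the coordinates use (an arbitrary default here).
    """
    close = total = 0
    for names in duplicate_groups:
        beads = [parse_bead_id(n) for n in names]
        for (p1, x1, y1), (p2, x2, y2) in combinations(beads, 2):
            total += 1
            if p1 == p2 and hypot(x1 - x2, y1 - y2) <= max_dist:
                close += 1
    return close / total if total else 0.0

# Hypothetical duplicate groups, for illustration only.
groups = [
    ["1_10_10_F3", "1_12_11_F3"],
    ["1_10_10_F3", "2_500_900_F3", "1_2000_1500_F3"],
]
print(adjacent_pair_fraction(groups))
```

If most duplicate pairs turn out to be close neighbours on the same panel, that would point to signal bleed-over rather than a bottlenecked library.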


tdm 03-31-2011 05:19 AM

I should be able to work something out from the coordinates in the color-space csfasta file.
thank you

niceday 03-31-2011 08:15 AM

Can I also point to the number of PCR cycles?
We were discussing the same problem yesterday.
12 cycles after library production, then size selection then enrichment. This is already producing duplicates.
After enrichment there are usually another 8 to 12 cycles?

We are looking at going into enrichment without PCR and size selection. Not sure if we need to order our own adapters, as I believe the Agilent adapters for SOLiD enrichment are truncated.

This is how we do our Illumina enrichments and we don't get the same duplication problem.

Today we tried 6 cycles of PCR after enrichment giving us a total of 18 overall. We will check the sequence data as it comes off.

pmiguel 03-31-2011 08:42 AM

Yes you are talking about a "bottomed out" library. You start with a certain amount of DNA. For SureSelect, I don't know how much. But rule of thumb: 1 ug of DNA fragmented to average 1 kb size comprises a trillion (10^12) molecules. Probably you are closer to 10 trillion initially because fragmentation would be down to closer to 100 bp?
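
The rule of thumb can be checked with a quick calculation. This is a sketch assuming an average of about 650 g/mol per base pair of double-stranded DNA (a common approximation; some people use 660). The function name is made up for illustration.

```python
AVOGADRO = 6.022e23          # molecules per mole
GRAMS_PER_MOL_PER_BP = 650   # approximate average for dsDNA (assumption)

def dna_molecules(mass_grams, mean_length_bp):
    """Approximate number of dsDNA fragments in a given mass."""
    moles = mass_grams / (mean_length_bp * GRAMS_PER_MOL_PER_BP)
    return moles * AVOGADRO

print(f"{dna_molecules(1e-6, 1000):.2e}")   # 1 ug at 1 kb
print(f"{dna_molecules(1e-6, 100):.2e}")    # 1 ug at 100 bp
print(f"{dna_molecules(1e-13, 100):.2e}")   # 0.1 pg at 100 bp
```

Under that assumption, 1 ug at 1 kb comes out a little under 10^12 molecules, and the same mass at 100 bp is ten times that.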

Then you have a series of steps that winnows the total number of real fragments down. Each is a bottle-neck. How narrow is the bottle-neck? Often this is hard to assess--especially when PCR is being deployed. 10 cycles of PCR potentially can amplify the amount of DNA you have 1000-fold. That is, picograms become nanograms. Or, one million molecules becomes one billion. But, obviously the billion will all be copies of the original million. Yes, something to be concerned about.
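
As a small illustration of why the cycles add copies but not complexity, here is a sketch of the ideal fold amplification. It assumes a fixed per-cycle efficiency (1.0 means perfect doubling), which real PCR does not achieve. The names are hypothetical.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Ideal fold increase after `cycles` of PCR at a fixed efficiency."""
    return (1.0 + efficiency) ** cycles

distinct_molecules = 1e6
total_copies = distinct_molecules * fold_amplification(10)

print(fold_amplification(10))   # about 1000-fold for 10 perfect cycles
print(total_copies)             # about a billion copies
print(distinct_molecules)       # still only a million distinct fragments
```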


tdm 03-31-2011 09:02 AM

Hi niceday

Originally Posted by niceday (Post 38468)
We are looking at going into enrichment without PCR and size selection. Not sure if we need to order our own adapters, as I believe the Agilent adapters for SOLiD enrichment are truncated.

With the SOLiD protocol, it is not possible to go into enrichment without PCR, because the adapters are truncated and we need the nick translation step to attach the adapters correctly.
Thank you for your help.

pmiguel 03-31-2011 09:48 AM

Technically, you could do a single cycle of PCR to "step-out" your adapters to their full length. That can result in no PCR duplication. Personally, I would do at least 3 cycles. And would only go that low with trepidation. Lots of issues that don't matter at ng/ul concentrations do matter at pg/ul. (DNA binding levels of plastics, for instance.)


All times are GMT -8.
