SEQanswers

Seq84 04-08-2011 04:38 AM

Duplicates percentage in target resequencing
 
Hi,

We are performing a targeted resequencing experiment in which we enriched 8 regions.
  1. What is the mean % of duplicates in this kind of experiment?
  2. Is the % of duplicates random, or is it affected by specific factors?
  3. During quality control analysis we noticed that the % of duplicates ranges from 1 to 80%. Is it normal to have a high % of duplicates in a targeted resequencing experiment?
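
For readers unsure how the duplicate percentage is derived: QC tools generally flag reads that share mapping coordinates and strand with another read, keep one per group, and report the rest as duplicates. The following is a minimal Python sketch of that idea, assuming each read has already been reduced to a hypothetical (chrom, start, strand) tuple. Real tools such as Picard MarkDuplicates or samtools markdup also take the mate position and base qualities into account, so their numbers will differ.

```python
from collections import Counter


def duplicate_percentage(alignments):
    """Percent of reads sharing (chrom, start, strand) with another read.

    `alignments` is an iterable of hypothetical (chrom, start, strand)
    tuples, one per mapped read. This is a simplification for illustration.
    """
    alignments = list(alignments)
    if not alignments:
        return 0.0
    counts = Counter(alignments)
    # Within each group of identical coordinates, one read is kept as the
    # original and the remaining reads are counted as duplicates.
    duplicates = sum(n - 1 for n in counts.values())
    return 100.0 * duplicates / len(alignments)


# Illustrative data only, not taken from the thread.
reads = [
    ("chr1", 1000, "+"),
    ("chr1", 1000, "+"),
    ("chr1", 1000, "-"),
    ("chr2", 500, "+"),
    ("chr2", 500, "+"),
]
print(duplicate_percentage(reads))
```

Here two of the five reads repeat an earlier coordinate, so the function reports them as duplicates.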

Heisman 04-08-2011 07:04 AM

A high duplicate percentage comes from (generally) too many PCR cycles pre-hybridization. For standard exomes we typically see < 5% PCR duplicates using Agilent SureSelect.

NGSfan 04-08-2011 07:10 AM

Quote:

Originally Posted by Heisman (Post 39104)
A high duplicate percentage comes from (generally) too many PCR cycles pre-hybridization. For standard exomes we typically see < 5% PCR duplicates using Agilent SureSelect.


5% is impressively low. Would the target size have an effect? We are sampling just 500 genes rather than a whole exome. I've noticed PCR duplicates from 14% to 50% in our data. I am trying to convince the wet lab people to lower their PCR cycles.

How many cycles are you doing? I think the wet lab is doing 16 cycles.

Heisman 04-08-2011 07:30 AM

If we can start with 3 ug of DNA we can get away with 7 or fewer PCR cycles. The protocol states to do 4-6, but when I've done 8 cycles I haven't had any issues. 16 prior to hybridization seems quite high, though. I communicated with an Agilent rep who told me that duplicates are mainly caused by too many cycles pre-hyb.

Robby 04-29-2011 05:52 AM

Hi,
I think the number of reads and the target region size are important as well.

Nevertheless, we have the duplication problem as well. We used the Agilent AllExon kit and sequenced on the HiSeq (one lane per sample). We observed a duplication rate of 40-50%, although we stuck to the protocol. We also observe that the duplication rate with the GA II is much lower than with the HiSeq. Does anyone observe the same?

How many reads did you map, to what target region size, and what duplication rate did you observe?
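
One way to see why read count and target size could matter: if the captured library contains C distinct on-target fragments and N reads are drawn from it at random, the expected number of distinct fragments seen is C(1 - e^(-N/C)), so the expected duplicate fraction is 1 - C(1 - e^(-N/C))/N. The sketch below assumes that idealized model (uniform sampling, no optical duplicates). The function name and the library size and read counts in the loop are illustrative assumptions, not values from this thread.

```python
import math


def expected_duplicate_fraction(n_reads, library_size):
    """Expected duplicate fraction when n_reads are sampled uniformly
    at random (with replacement) from library_size distinct fragments."""
    if n_reads <= 0:
        return 0.0
    # Expected distinct fragments observed: C * (1 - exp(-N / C)).
    expected_unique = library_size * -math.expm1(-n_reads / library_size)
    return 1.0 - expected_unique / n_reads


# Hypothetical library of 20 million distinct on-target fragments,
# sequenced at increasing depth.
for n_reads in (5e6, 20e6, 80e6):
    fraction = expected_duplicate_fraction(n_reads, 20e6)
    print(f"{n_reads:.0e} reads: {100 * fraction:.1f}% expected duplicates")
```

Under this model the same library shows a higher duplicate rate simply because it is sequenced more deeply, and a smaller target (smaller C) reaches a given rate with fewer reads.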

@Heisman: Did you sequence with the HiSeq or the GA II? How many reads did you receive?

Heisman 04-29-2011 06:16 AM

@Robby: We sequence with the HiSeq and I generally get ~90-100 million reads. We map to the whole genome using novoalign, but that shouldn't be a determining factor in how many duplicate reads there are. I have no idea why you are getting so many duplicates or, even more interestingly, why you would get fewer duplicates with the GAIIx. I've generally started with 3-4 ug of DNA and done 7-8 PCR cycles pre-hybridization (around 12 after hybridization).

