Old 09-09-2017, 11:48 PM   #1
Junior Member
Location: Shanghai

Join Date: Sep 2017
Posts: 1
Confused about read duplicates after adapter ligation

Recently I have been working on developing new methods for NGS PE library preparation, and I am confused about PCR duplicates. It is obvious that duplicates will arise during PCR amplification when dealing with low DNA input. However, when I think about PCR-free library preparation, one double-stranded DNA molecule is ligated to Y adapters at both its left and right ends. This results in 2 different single-stranded products:
1. 5'-P5 - plus strand insert ssDNA -P7'-3'
2. 3'-P7'- minus strand insert ssDNA -P5-5'

These 2 single-stranded products are effectively duplicates, since the inserts are fully complementary to each other and both could bind to the flow cell. This would mean that even for a PCR-free library, 50% of the reads are duplicates, because in theory both strands of one dsDNA molecule can be sequenced. However, when I examined the FASTQ files after sequencing, the percentage of duplicates was much lower than 50%; for the PCR-free library there were almost no duplicates. Can anybody help me with this?
hujian5241
Old 09-11-2017, 12:08 PM   #2
Senior Member
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,281

One read pair will be the reverse complement of the other read pair, so they will not be considered "duplicates".
Also, even if your software did consider these to be "duplicates", not all amplicon strands cluster. Different flowcells have different clustering efficiencies.
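
To illustrate the first point, here is a minimal Python sketch. It assumes that read 1 always starts at the P5-proximal (5') end of whichever strand formed the cluster and that read 2 comes from the opposite end on the complementary strand; the insert sequence and the `read_pair` helper are made up for illustration.

```python
# Hypothetical illustration: reads produced by the two strands of one fragment.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def read_pair(strand: str, read_len: int) -> tuple[str, str]:
    """Assumed model: read 1 is the 5' end of the clustered strand,
    read 2 is the 5' end of its reverse complement."""
    return strand[:read_len], reverse_complement(strand)[:read_len]

plus = "ATGCCGTAAGCTTGACCA"        # made-up insert, plus strand
minus = reverse_complement(plus)   # the other strand of the same fragment

pair_plus = read_pair(plus, 6)
pair_minus = read_pair(minus, 6)

print("plus strand :", pair_plus)
print("minus strand:", pair_minus)
# The pairs cover the same fragment, but read 1 and read 2 are swapped,
# so a check that compares read 1 to read 1 and read 2 to read 2 sees no match.
print("identical pairs?", pair_plus == pair_minus)
```

Under this model the two clusters give the same two sequences with read 1 and read 2 exchanged, which is why a duplicate check that takes read orientation into account would not flag them.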

If you cluster about 100 ul of a 20 pM library in one lane of a HiSeq 2500, you get about 150-220 million pass-filter clusters. To a first approximation, "pM" is millions of molecules/ul, so about 2 billion molecules of your library go into a HiSeq 2500 lane. So only about 20% actually cluster. If each strand clusters independently, your chance of clustering both strands from a given amplicon would be about 4% (0.2 x 0.2).
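
The arithmetic can be checked with a short Python sketch. It uses Avogadro's number instead of the round "pM is millions of molecules/ul" rule, and it assumes the two strands cluster independently; the function names are hypothetical. With the exact conversion, 100 ul at 20 pM is about 1.2 billion molecules, so the clustering fraction comes out at roughly 12-18%, the same order as the ~20% quoted above.

```python
AVOGADRO = 6.022e23  # molecules per mole

def molecules_loaded(conc_pM: float, volume_ul: float) -> float:
    """Number of library molecules in volume_ul of a conc_pM solution."""
    mol_per_litre = conc_pM * 1e-12
    litres = volume_ul * 1e-6
    return mol_per_litre * litres * AVOGADRO

def clustering_fraction(clusters: float, loaded: float) -> float:
    """Fraction of loaded molecules that end up as pass-filter clusters."""
    return clusters / loaded

def both_strands_probability(p: float) -> float:
    """Chance that both strands of one fragment cluster,
    assuming each strand clusters independently with probability p."""
    return p * p

loaded = molecules_loaded(conc_pM=20, volume_ul=100)
print(f"molecules loaded: {loaded:.3g}")

for clusters in (150e6, 220e6):
    p = clustering_fraction(clusters, loaded)
    print(f"{clusters:.0f} clusters: p = {p:.1%}, both strands = {both_strands_probability(p):.1%}")

# The rounded figure from the post: p = 20% gives 4%.
print(f"p = 20%: both strands = {both_strands_probability(0.2):.0%}")
```

Either way, the chance that both strands of a given fragment are sequenced stays at a few percent, far below the 50% expected if every strand clustered.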

I think most other Illumina instruments cluster at a lower efficiency than the HiSeq 2500.

pmiguel