#1 |
Junior Member
Location: San Francisco, Ca Join Date: Mar 2013
Posts: 9
Hello,
I am working on a deep sequencing protocol for a PCR amplicon (~650bp) using the TruSeq DNA PCR-Free Sample Preparation Kit and I am seeing extra peaks in my final bioanalyzer traces that concern me because I don't know what they might come from.

Peaks on the trace:
+ Small peak at size of original insert
+ Medium-sized peak that I think might correspond to insert+1adapter
+ Large peak that I think might correspond to insert+2 adapters
- mini peak to the right of the "insert" peak
- mini peak to the right of the "insert+1adapter" peak
- mini peak to the right of the "insert+2adapters" peak

Does anyone know what the last three peaks might be? I am attaching both the TapeStation on the initial PCR material and the bioanalyzer on the final library.

Thanks!
Dan
#2 | |
Senior Member
Location: Purdue University, West Lafayette, Indiana Join Date: Aug 2008
Posts: 2,317
|
![]() Quote:
Illumina adapters are each about 60 bases long. But you may be right. Illumina kit adapters are "Y"-adapters with only about 10 bp of double-stranded DNA and the rest (~50 bases) as double single-stranded tails. Single-stranded molecules tend to migrate slower on Agilent chips than double-stranded molecules of corresponding length. So those Y-adapters may be introducing some drag. But that would be a lot of drag from a few hundred bases of double ssDNA. Seems unlikely to me.

Another hypothesis would be that the "1539 bp" fragment is migrating slowly because the ligase is still attached to the amplicon.

Another possibility (this is the one I like) is that the "901 bp" fragment has both adapters ligated and is running only a little larger than expected because of the Y-adapters: 656+120=776bp double-stranded length. The 1539 bp fragment would have a double insert, 656+656+120=1432, which would fit pretty well. That would suggest the A-tailing step did not work well -- it left a substantial percentage of the ends blunt.

You posted this question long ago, so you could probably update us on your results. But if you used this library, it probably worked okay. For reasons unclear to me, short amplicons seem to cluster vastly better than longer amplicons. So your data set would remain fairly free from chimerics.
-- Phillip
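For anyone who wants to check the arithmetic in the post above, here is a minimal Python sketch. It assumes roughly 60 bases per adapter (the figure quoted in the post) and a 656 bp insert, lists the nominal double-stranded length of each candidate ligation product, and picks the one nearest to each reported peak. The function names are made up for illustration, and a nearest-length match ignores any extra drag from the Y-adapter tails, so treat it as a rough sanity check rather than a peak-calling method.

```python
ADAPTER_BP = 60  # approximate length of one adapter, as quoted in the post


def product_length(insert_bp, n_inserts, n_adapters, adapter_bp=ADAPTER_BP):
    """Nominal length of a product made of n_inserts inserts and n_adapters adapters."""
    return insert_bp * n_inserts + adapter_bp * n_adapters


def candidates(insert_bp, max_inserts=2):
    """Map (n_inserts, n_adapters) -> nominal length for 0, 1 or 2 adapters."""
    table = {}
    for n_ins in range(1, max_inserts + 1):
        for n_ad in (0, 1, 2):
            table[(n_ins, n_ad)] = product_length(insert_bp, n_ins, n_ad)
    return table


def closest(observed_bp, table):
    """Return ((n_inserts, n_adapters), length) of the candidate nearest to observed_bp."""
    return min(table.items(), key=lambda item: abs(item[1] - observed_bp))


if __name__ == "__main__":
    table = candidates(656)
    for peak in (901, 1539):  # apparent sizes reported on the chip
        (n_ins, n_ad), length = closest(peak, table)
        print(f"{peak} bp peak: nearest is {n_ins} insert(s) + {n_ad} adapter(s) = {length} bp")
```

On these assumptions the nearest candidates are the single insert with two adapters and the double insert with two adapters, which is the reading favoured in the post.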
#3 | |
Jafar Jabbari
Location: Melbourne Join Date: Jan 2013
Posts: 1,238
|
![]() Quote:
Last edited by nucacidhunter; 05-29-2014 at 06:18 PM. |
#4 | |
Senior Member
Location: Purdue University, West Lafayette, Indiana Join Date: Aug 2008
Posts: 2,317
|
![]() Quote:
So, whatever the explanation, it needs to account for this. That leads one to think there must be some sort of competition among amplicons: something that would allow the shorter amplicons to displace the longer ones and prevent them from creating clusters. That way, if the shorter amplicons are removed, the longer amplicons can form good clusters. However, I can't think of a reasonable mechanism of competition. So maybe something else is going on?
-- Phillip
#5 | ||
Jafar Jabbari
Location: Melbourne Join Date: Jan 2013
Posts: 1,238
|
![]() Quote:
Quote:
#6 | ||
Senior Member
Location: Purdue University, West Lafayette, Indiana Join Date: Aug 2008
Posts: 2,317
|
![]() Quote:
Not sure what is happening with your actual amplicon libraries. But in our case, we just used the concentration based on qPCR and nailed the cluster density. Quote:
-- Phillip |
#7 | |
Jafar Jabbari
Location: Melbourne Join Date: Jan 2013
Posts: 1,238
|
![]() Quote:
Last edited by nucacidhunter; 06-02-2014 at 07:25 AM. |
#8 | ||
Senior Member
Location: Purdue University, West Lafayette, Indiana Join Date: Aug 2008
Posts: 2,317
|
![]() Quote:
You write: Quote:
-- Phillip |
#9 |
Senior Member
Location: Purdue University, West Lafayette, Indiana Join Date: Aug 2008
Posts: 2,317
#10 | ||
Jafar Jabbari
Location: Melbourne Join Date: Jan 2013
Posts: 1,238
|
![]() Quote:
Quote:
|
#11 | |
Senior Member
Location: Purdue University, West Lafayette, Indiana Join Date: Aug 2008
Posts: 2,317
|
![]() Quote:
Here, as a sign of my respect, is a more fleshed-out explanation of why I think what I will call the "RTA-mediated" hypothesis (for why shorter amplicons of a given library predominate in Illumina data sets) is not sufficient to explain the actual phenomenon. I have a particular set of data that makes me think so. Here is a link to the full thread.

But to summarize, we made a "large insert" TruSeq DNA library but used extra/more stringent Ampures to remove shorter fragments. We did 4 cycles of PCR on it (instead of the protocol's recommended 10) and clustered at 4pM (rather than what was normal on the MiSeq at the time -- 8pM). Here is an Agilent chip of the library we clustered:

Again, we nailed the cluster density using our normal KAPA qPCR calculation using, if I recall correctly, the modal peak size depicted above (1892bp) in the calculation specified in the KAPA kit manual. That would include 120bp of adapters, so think of the inserts as having a modal size of 1772bp, or a little less due to some distortion from DNA mass, rather than DNA count, being assayed by the Agilent chip.

However, the result of the run, when mapped back to a reference genome with BWA, produced paired-end insert lengths as depicted here:

Okay, one might argue that the lower graph is on a linear scale and represents counts of DNA molecules, whereas the top graph is mass-based and displayed in the more-or-less log-linear scale that one typically sees from electrophoresis. Again, in the previous thread, I exported the data from the Agilent chip and transformed it so it would be on the same scale as the lower chart so they could be directly compared:

So, it still comports fairly well with the earlier statement: a modal size of 1772bp, or a little less. Certainly no lower than 1600bp.
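One simple way to put a mass-based trace on a molecule-count scale (not necessarily the exact transformation used in the earlier thread) is to divide the signal in each size bin by the fragment length and renormalise. The Python sketch below assumes the exported signal in each bin is proportional to DNA mass; the function names and the four-bin example profile are invented purely for illustration.

```python
def mass_to_count_fraction(sizes_bp, mass_signal):
    """Convert a mass-weighted size profile into relative molecule counts.

    Assumes mass_signal[i] is proportional to the DNA mass in the bin at sizes_bp[i].
    Mass per molecule scales with length, so counts scale with mass / length.
    """
    if len(sizes_bp) != len(mass_signal):
        raise ValueError("sizes_bp and mass_signal must have the same length")
    counts = [mass / size for size, mass in zip(sizes_bp, mass_signal)]
    total = sum(counts)
    return [c / total for c in counts]


def modal_size(sizes_bp, weights):
    """Size of the bin with the largest weight."""
    return max(zip(sizes_bp, weights), key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    # Invented example profile, not data from the thread.
    sizes = [500, 1000, 2000, 4000]
    mass = [1.0, 4.0, 6.0, 4.0]
    count_fraction = mass_to_count_fraction(sizes, mass)
    print("mode by mass: ", modal_size(sizes, mass))
    print("mode by count:", modal_size(sizes, count_fraction))
```

The point of the toy profile is only that the count-based mode sits at or below the mass-based mode, which is the "or a little less" caveat in the post.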
Keeping that in mind, the RTA-mediated hypothesis fails to explain our hitting cluster density exactly while the size distribution of what was sequenced shifted lower by about 500 bp. It fails because, were that the case, the loss of clusters from 1.1 to 1.6 kb and above should have decreased the total number of read pairs. That is, these longer amplicon clusters should have been there physically, but just not detected by RTA. So our effective (RTA-calculated) cluster density should have been much lower than what we calculated using qPCR. But it wasn't.

I don't actually think that the short amplicons are displacing the longer ones from the flowcell during clustering. I think something else, something unknown, is going on. Okay, that is supposition also. But I need some mechanism that allows qPCR to accurately quantitate cluster density for a pool of long amplicons -- that is what I see.

As with all physical phenomena, there are plenty of explanations that might account for what I describe above. But I see no reason at all to favor the RTA-mediated explanation, for which, other than unsubstantiated claims from Illumina, there is no evidence. See what I am saying here? The RTA-mediated explanation is just a story. It may have been invented whole-cloth by someone at Illumina and come to be propagated as dogma without any particular evidence. Stuff like that happens all the time. Just because it is superficially reasonable doesn't mean it is true.
-- Phillip
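For readers unfamiliar with the size-adjusted qPCR calculation referred to in this post, it can be sketched as scaling the qPCR reading by the ratio of the standard's length to the library's length. The Python below is a minimal sketch under stated assumptions: the function name is hypothetical, the 100 pM reading is invented, and the 400 bp standard length is the figure quoted elsewhere in this thread rather than a value checked against the kit manual.

```python
def size_adjusted_conc(qpcr_conc_pm, standard_bp, library_bp):
    """Scale a qPCR concentration by standard length / library length.

    qpcr_conc_pm: concentration read off the standard curve (pM)
    standard_bp:  length of the qPCR standard (assumption: supplied by the user)
    library_bp:   representative library fragment length, adapters included
    """
    if standard_bp <= 0 or library_bp <= 0:
        raise ValueError("lengths must be positive")
    return qpcr_conc_pm * standard_bp / library_bp


if __name__ == "__main__":
    reading_pm = 100.0  # invented qPCR reading
    standard_bp = 400   # figure quoted in this thread; check the kit manual
    for library_bp in (1892, 1400):  # modal peak from the post vs. a shorter guess
        adjusted = size_adjusted_conc(reading_pm, standard_bp, library_bp)
        print(f"assumed library size {library_bp} bp -> {adjusted:.2f} pM")
```

The two assumed library sizes are there to show how sensitive the adjusted concentration, and hence the loading dilution, is to the fragment length plugged into the calculation.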
#12 | ||
Jafar Jabbari
Location: Melbourne Join Date: Jan 2013
Posts: 1,238
|
![]() Quote:
Quote:
|
#13 | |
Senior Member
Location: Purdue University, West Lafayette, Indiana Join Date: Aug 2008
Posts: 2,317
|
![]() Quote:
-- Phillip |
#14 | |
Senior Member
Location: Purdue University, West Lafayette, Indiana Join Date: Aug 2008
Posts: 2,317
|
![]() Quote:
Also, wasn't it necessary to remove the SYBR green, etc. from the qPCR reaction prior to running the chip? What method did you use? -- Phillip |
#15 | |
Jafar Jabbari
Location: Melbourne Join Date: Jan 2013
Posts: 1,238
|
![]()
The question is why short amplicons sequence better than large ones. The evidence is that when a library with a broad size distribution is sequenced, after mapping reads, one finds that the average size or peak of the mapped fragments is smaller than that of the input library, indicating preferential sequencing of smaller library fragments. This was not important in earlier days, when libraries were size-selected in a narrow range and multiplexing was not very widespread. But since the introduction of gel-free library prep kits (bead-based size selection resulting in libraries with a wide distribution of fragment sizes), the widespread use of transposon-mediated broad library preps and the increased output of platforms, it has become more important. When pooling libraries with different insert sizes for sequencing, this should be taken into account to obtain the desired proportionate number of reads from each library.

My answer, as suggested in this thread, is the "RTA-mediated hypothesis". Short fragments are more efficient in forming clusters because during bridge amplification the polymerase is more likely to synthesize a full complementary strand (end to end) for short fragments than for large ones, due to the limited extension time (15 sec in MiSeq). During template generation (the early 4-5 cycles) RTA uses signal intensities from images and calls bases from normalised intensities (taking colour cross-talk and phasing correction into account). Raw data are filtered to remove reads that do not meet the signal purity threshold, as well as overlapping and low-intensity clusters. At this step, in a population of small and large fragment clusters, the small ones would have higher intensity (it is proportional to strand number, which results from amplification efficiency) and therefore are preferentially detected and have their base composition called. But large fragments, because of less efficient amplification, will have lower intensity and would not be favoured by RTA. Of course, in a flow cell lane with larger fragments most of the clusters would have lower intensity compared to a lane with predominantly small fragments. But because RTA detection of clusters is relative (normalised intensity, not raw), they are still detected and their bases are called.

The argument against this hypothesis is the evidence from sequencing a large-insert library (1-5 kb) in which the qPCR-predicted cluster density was achieved. I have two counterarguments. Firstly, cluster density and library input are not linearly related. For example, if 12 pM input gives 800K clusters/mm² of a flow cell lane, 8 pM input will not result in 600K clusters. Secondly, quantifying large fragments with KAPA qPCR is not accurate because the standards are 400 bp and their amplification efficiency would be higher than that of large fragments in the 1-5 kb range, as in this case. In addition, if the extension time is not increased significantly, large fragments will drop out and only a small portion of the library will be amplified and quantified, resulting in underestimation of quantity.

The attachment in this post shows ScreenTape profiles from the input library and the output from the qPCR reaction, showing preferential amplification of smaller fragments during PCR. The qPCR reactions were purified using 1.8x AMPure beads to remove salts, polymerase, SYBR and nucleotides.
Quote:
Last edited by nucacidhunter; 06-06-2014 at 04:15 AM. |
#16 |
Senior Member
Location: Purdue University, West Lafayette, Indiana Join Date: Aug 2008
Posts: 2,317
|
![]()
Okay, I accept that as actual evidence. I still don't buy it for explaining the whole phenomenon. But at this point that is little more than hand waving. I'm unlikely to pony up the time and resources to do further testing, so this is as far as it goes, probably...
-- Phillip |
Tags |
emulsion pcr, pcr amplicons, pcr-free, truseq, truseq dna |