

Old 10-26-2011, 12:50 PM   #1
Senior Member
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default Strange Illumina library artifact.

We ran some "RAD" (Restriction site Associated DNA) tag libraries on our Illumina HiScanSQ. The libraries were not constructed by us, but the protocol identifies them as:

Adapted from “Sequenced RAD Markers for Rapid SNP Discovery and Genetic Mapping”, Paul D. Etter (University of Oregon) and modified by Michael R. Miller (University of Oregon)
Briefly, genomic DNA is digested with a restriction enzyme (SbfI, an 8-cutter that generates a 4-base 3' overhang). "F" adapters are ligated to the ends of the digested DNA utilizing the 3' overhang. Then the DNA is sonicated to a reasonable size, end-polished, A-tailed, then ligated to the "R" adapter. Enrichment PCR follows, then gel size selection.

The "F" adapter contains a 5 bp, in-line index, read prior to the remnant of the SbfI site. Since these are Illumina libraries, it is important to get those first 5 bases randomized, which is accomplished by pooling libraries.

Now the artifact. See below:

Lanes 2 and 3 show this artifact, whereas it is absent or much diminished in lanes 1 and 4. Note that the 6 remaining bases of the SbfI site are clearly visible in lanes 1 and 4, whereas the sequence upstream and downstream is nicely randomized. However, only about 1/2 the reads appear to share the SbfI site in lanes 2 and 3. The marauding sequence is identical to the 20 nt "reverse" primer used during enrichment PCR, CGTATGCCGTCTTCTGCTTG. So that would make it, what, a flow cell oligo?
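A rough way to put a number on this from the reads themselves is sketched below. It is only an illustration, not how the figures in this post were made: it assumes the reads are already plain sequence strings, that the 6-base SbfI remnant reads as TGCAGG directly after the 5 bp index, and that a 12-base prefix of the primer is enough to call a read a primer read. The function names and the `min_primer_match` cutoff are made up for the example.

```python
PRIMER = "CGTATGCCGTCTTCTGCTTG"  # 20 nt reverse enrichment primer quoted in the post
SBFI_REMNANT = "TGCAGG"          # assumed 6-base remnant of the SbfI site (CCTGCAGG)
INDEX_LEN = 5                    # in-line index read before the remnant


def classify_read(seq, min_primer_match=12):
    """Return 'rad', 'primer', or 'other' for one read sequence."""
    seq = seq.upper()
    # A good RAD read has the SbfI remnant right after the in-line index.
    if seq[INDEX_LEN:INDEX_LEN + len(SBFI_REMNANT)] == SBFI_REMNANT:
        return "rad"
    # The primer may be cut off by the end of the read, so accept a prefix
    # of it (hypothetical cutoff) found anywhere in the read.
    if PRIMER[:min_primer_match] in seq:
        return "primer"
    return "other"


def summarize(reads):
    """Return the fraction of reads in each class."""
    counts = {"rad": 0, "primer": 0, "other": 0}
    for seq in reads:
        counts[classify_read(seq)] += 1
    total = sum(counts.values())
    return {k: (v / total if total else 0.0) for k, v in counts.items()}
```

Run per lane on a sample of reads, a "primer" fraction near 0.5 in lanes 2 and 3 would match what the composition plots suggest.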

Of note: these samples were cut from an agarose gel at 600-750 bp. But upon denaturation and running on an Agilent pico RNA chip, modest amounts of smaller fragments are visible. (See figure above.) The inset has a zoomed-in view of the putative culprits. They are not present at high concentrations compared to the main peak. Nevertheless, in two of the samples (1083 and 1084, in lanes 2 and 3), they look to be consuming about 1/2 of the sequence being generated.

An earlier run had larger amounts of these and was nearly ruined by them. qPCR gave low estimates of their molar concentration (because of their low molecular weight) and so the lanes were tending towards over-clustering. Worse, the dimer predominated enough that it appeared to be interfering with cluster registration, leading to low pass-filter percentages.

Just an FYI, really. Primer dimers, once they show up during enrichment PCR, can be the devil to get rid of. They can anneal to the main library amplicons and thwart double-stranded size selection.

pmiguel is offline   Reply With Quote
Old 11-07-2011, 07:12 PM   #2
Location: USA

Join Date: Jul 2010
Posts: 58

Hi Phillip,

We've been experiencing this issue too when doing RNA-seq (lower-input sequencing). We can clearly read the primer dimers in the base-pair composition display. But I'm a bit surprised to see it given that you cut your fragments out of a gel.

I also reckoned that even a relatively low concentration of primer dimer would affect the run significantly: a small mass of shorter fragments still has a high molarity, so the effect was more severe than I expected. Since this issue, we regularly perform dissociation analysis in qPCR to check whether samples include primer dimers. Sometimes they do not show up clearly on the Bioanalyzer, and if a Bioanalyzer well or two misses them, there would otherwise be no way to catch them.
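The mass-versus-molarity point can be made concrete with the usual approximation of about 660 g/mol per base pair of dsDNA. The sketch below is just that arithmetic; the concentrations and the 70 bp dimer length in the usage note are made-up illustrative values, not measurements from this thread.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA


def nanomolar(ng_per_ul, length_bp):
    """Convert a dsDNA mass concentration (ng/uL) to molarity (nM)."""
    # ng/uL equals 1e-3 g/L; dividing by g/mol gives mol/L; 1e9 converts to nM.
    return ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def molar_fraction(fragments):
    """Given (ng_per_ul, length_bp) pairs, return each one's share of molecules."""
    molarities = [nanomolar(conc, length) for conc, length in fragments]
    total = sum(molarities)
    return [m / total for m in molarities]
```

For example, `molar_fraction([(1.0, 675), (0.1, 70)])` compares a hypothetical 675 bp library at 1 ng/uL with a hypothetical 70 bp dimer at one tenth of that mass. The two come out at nearly equal molar shares, which is the kind of ratio that would let a faint peak take about half the clusters.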
sehrrot is offline   Reply With Quote
Old 11-08-2011, 05:41 AM   #3
Senior Member
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317

A couple of points:

(1) Just to be clear, the main issue is that primer dimers share substantial amounts of sequence with the adapters of your library molecules. So one strand of a primer dimer molecule can anneal to the adapter of a good amplicon after a denaturation step. Once annealed to a longer molecule, they won't be removed by cutting a band of the correct size out of a gel: they will be included in the "correct size" fraction. Just a downside to running DNA double-stranded. Nor will they be visible to dsDNA assays, probably.

(2) Do dissociation assays really allow you to detect primer dimers? I thought that assay was designed for more traditional qPCR experiments where a single gene is being amplified. A single gene would have a particular dissociation profile and should produce a single peak.

However, libraries comprise a vast variety of molecules, and one could imagine situations in which they would be bimodal.

pmiguel is offline   Reply With Quote
Old 05-08-2016, 06:50 AM   #4
Junior Member
Location: New York

Join Date: Nov 2011
Posts: 4

I think we are having a similar problem. The library size distribution looks great on the Tapestation and then we end up sequencing lots of the PCR primer dimers (or short inserts in general). Did you ever track down the cause of this?
* Too high concentration of input primers?
* Too many rounds of PCR (overamplified)?
* Too little input to the enrichment PCR?

Any solutions that you have come up with? Thanks!
ipeikon is offline   Reply With Quote
Old 05-08-2016, 11:41 PM   #5
Jafar Jabbari
Location: Melbourne

Join Date: Jan 2013
Posts: 1,238

The Bioanalyser trace indicates the presence of more primer/adapter dimers (fragments <200 bp) in the samples in lanes 2 and 3, and accordingly those lanes have more clusters with dimers. The dimers also mask the restriction site overhang sequences for those samples, which are obvious for the samples in lanes 1 and 4. The high percentage of A residues in lanes 2 and 3 is another indicator that short fragments have run into the flow cell oligo lawn. The general wavy shape of the sequence traces in all lanes is an indicator of low diversity.
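For anyone who wants to look at the same per-cycle base composition outside the sequencer's own viewer, a minimal sketch follows. It assumes reads are available as plain strings and simply tallies base fractions at each cycle; `per_cycle_composition` is a hypothetical helper name, not part of any Illumina tool.

```python
from collections import Counter


def per_cycle_composition(reads):
    """Return one {base: fraction} dict per cycle (read position)."""
    if not reads:
        return []
    n_cycles = max(len(r) for r in reads)
    composition = []
    for i in range(n_cycles):
        # Only reads long enough to reach this cycle contribute to it.
        counts = Counter(r[i].upper() for r in reads if len(r) > i)
        total = sum(counts.values())
        composition.append({b: counts.get(b, 0) / total for b in "ACGTN"})
    return composition
```

A fixed restriction site remnant shows up as cycles where one base is close to 1.0, and an excess of A in later cycles is the signature described above for short fragments.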

@ipeikon, if you use D1000 ScreenTape you may not see a low concentration of dimers; try running them on HSD1000. Unused primers should be cleaned up properly after PCR. If there is a large amount of dimers, then two clean-ups should reduce their numbers significantly. To find a specific reason for your libraries' issue, more information is needed.
nucacidhunter is offline   Reply With Quote
Old 05-09-2016, 02:10 PM   #6
Registered Vendor
Location: Eugene, OR

Join Date: May 2013
Posts: 521

If libraries are overamplified and the primer concentration drops, then you can get interactions between adapters that make the fragment analyzer results look like there are no short fragments. One way to check is to take an aliquot of your PCR product, do a new PCR reaction for a cycle, and check that product on the fragment analyzer as well. If you now see short fragments, then primer exhaustion plus adapter interactions is the reason and you need to cut back on the cycle number or do a quick second round of PCR.
Providing nextRAD genotyping and PacBio sequencing services.
SNPsaurus is offline   Reply With Quote

illumina, primer dimer, rad sequencing, sav
