We ran some "RAD" (Restriction site Associated DNA) tag libraries on our Illumina HiScanSQ. We did not construct the libraries ourselves, but the protocol describes them as follows:
Briefly, genomic DNA is digested with a restriction enzyme (SbfI, an 8-cutter that generates a 4-base 3' overhang). "F" adapters are ligated to the ends of the digested DNA using the 3' overhang. The DNA is then sonicated to a reasonable size, end-polished, A-tailed, and ligated to the "R" adapter. Enrichment PCR follows, then gel size selection.
The "F" adapter contains a 5 bp in-line index, which is read before the remnant of the SbfI site. Since these are Illumina libraries, it is important that those first 5 bases be randomized, and pooling libraries with different indexes accomplishes that.
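To check whether a pool of in-line indexes really does randomize the first 5 cycles, one could tabulate base composition per cycle. This is only a minimal sketch: the function name and the example indexes are hypothetical and are not taken from the protocol.

```python
from collections import Counter

def per_cycle_composition(indexes):
    """Return, for each cycle, the fraction of A/C/G/T across the pooled indexes.

    `indexes` is a list of equal-length in-line index sequences
    (hypothetical values; the real index set is not given in the post).
    """
    if not indexes:
        return []
    length = len(indexes[0])
    if any(len(ix) != length for ix in indexes):
        raise ValueError("all indexes must have the same length")
    composition = []
    for cycle in range(length):
        counts = Counter(ix[cycle].upper() for ix in indexes)
        composition.append({base: counts[base] / len(indexes) for base in "ACGT"})
    return composition

# Hypothetical 4-library pool, chosen so every cycle sees all four bases.
example_pool = ["ACGTA", "CGTAC", "GTACG", "TACGT"]
balanced = per_cycle_composition(example_pool)
```

A cycle where one base approaches a fraction of 1.0 would be the low-diversity case that pooling is meant to avoid.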
Now the artifact. See below:
Lanes 2 and 3 show this artifact, whereas it is absent or much diminished in lanes 1 and 4. Note that the 6 remaining bases of the SbfI site are clearly visible in lanes 1 and 4, whereas the sequence upstream and downstream is nicely randomized. In lanes 2 and 3, however, only about half of the reads appear to share the SbfI site. The marauding sequence is identical to the 20 nt "reverse" primer used during enrichment PCR, CGTATGCCGTCTTCTGCTTG. So that would make it, what, a flow cell oligo?
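For anyone wanting to put a number on this from the reads themselves, here is a rough sketch of how the tally could be done. It assumes the reads are already available as plain strings, that the 6-base SbfI remnant is TGCAGG (derived from the CCTGCAGG recognition site) and sits immediately after the 5 bp index, and that the primer may appear anywhere in the read. The function names are made up for illustration.

```python
from collections import Counter

INDEX_LEN = 5                      # 5 bp in-line index, per the protocol
SBFI_REMNANT = "TGCAGG"            # assumed remnant of the CCTGCAGG site
PRIMER = "CGTATGCCGTCTTCTGCTTG"    # 20 nt reverse enrichment primer from the post

def classify_read(seq):
    """Label a read as 'primer', 'rad', or 'other'."""
    seq = seq.upper()
    if PRIMER in seq:
        return "primer"
    if seq[INDEX_LEN:INDEX_LEN + len(SBFI_REMNANT)] == SBFI_REMNANT:
        return "rad"
    return "other"

def tally(reads):
    """Return the fraction of reads in each class (empty dict if no reads)."""
    counts = Counter(classify_read(r) for r in reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total for label in ("rad", "primer", "other")}

# Toy reads, invented purely to exercise the function.
toy_reads = [
    "ACGTATGCAGGAAAACCCC",           # index + SbfI remnant
    "CGTATGCCGTCTTCTGCTTGAAAA",      # primer-derived read
    "NNNNNNNNNNNN",                  # neither
    "GGTCATGCAGGTTTT",               # index + SbfI remnant
]
fractions = tally(toy_reads)
```

Run per lane, a "primer" fraction near 0.5 would match what the per-cycle plots suggest for lanes 2 and 3.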
Of note: these samples were cut from an agarose gel at 600-750 bp. But upon denaturation and running on an Agilent pico RNA chip, modest amounts of smaller fragments are visible. (See figure above.) The inset has a zoomed-in view of the putative culprits. They are not present at high concentrations compared to the main peak. Nevertheless, in two of the samples (1083 and 1084, in lanes 2 and 3), they appear to be consuming about half of the sequence being generated.
An earlier run had larger amounts of these and was nearly ruined by them. qPCR gave low estimates of their molar concentration (because of their low molecular weight), and so the lanes were tending towards over-clustering. Worse, the dimer predominated enough that it appeared to be interfering with cluster registration, leading to low pass-filter percentages.
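The arithmetic behind why a small peak can matter so much: per unit mass, short fragments contribute far more molecules than long ones. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA. The dimer length and the masses are hypothetical values picked for illustration, not measurements from these libraries.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair, a common approximation for dsDNA

def ng_per_ul_to_nM(ng_per_ul, length_bp):
    """Convert a mass concentration of dsDNA to nanomolar."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)

def molar_fractions(components):
    """Given (ng_per_ul, length_bp) pairs, return each component's molar share."""
    molar = [ng_per_ul_to_nM(mass, length) for mass, length in components]
    total = sum(molar)
    return [m / total for m in molar]

# Hypothetical mix: main library at 675 bp (midpoint of the 600-750 bp gel cut)
# and a 60 bp dimer (assumed length) present at one tenth of the mass.
library = (1.0, 675)
dimer = (0.1, 60)
shares = molar_fractions([library, dimer])
```

With these made-up numbers the dimer, at a tenth of the mass, accounts for roughly half of the molecules, so a modest peak on the chip is compatible with a large share of the clusters.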
Just an FYI, really. Primer dimers, once they show up during enrichment PCR, can be the devil to get rid of. They can anneal to the main library amplicons and thwart double-stranded size selection.
--
Phillip
Adapted from “Sequenced RAD Markers for Rapid SNP Discovery and Genetic Mapping”, Paul D. Etter (University of Oregon) and modified by Michael R. Miller (University of Oregon)