SEQanswers

Old 10-27-2009, 05:31 AM   #1
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default Where does all the DNA go?

The standard Roche protocol for shotgun library construction asks for 10 ug of input DNA to yield a few million templated beads for sequencing. Rule of thumb: 1 ug of 1 kb double stranded DNA is 1 trillion (1E+12) molecules[1].

Get that? 10 trillion molecules to start with so that I can sequence less than 10 million of them. What happened to the other 9,999,990,000,000 molecules?

Not really fair to the Roche protocol? Usually one ends up with enough library to sequence more than 10 million beads? Plug your own numbers in. My guess is the molecular yield from this technique will be no better than 0.1%.
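(For anyone who wants to plug their own numbers in, here is a minimal back-of-the-envelope sketch in Python. It is illustrative only and assumes the usual ~650 g/mol average mass per base pair of double-stranded DNA.)

Code:
AVOGADRO = 6.022e23   # molecules per mole
BP_MW = 650.0         # average g/mol per base pair of double-stranded DNA

def molecules(mass_ug, length_bp):
    """Approximate number of dsDNA fragments in mass_ug micrograms of length_bp fragments."""
    return mass_ug * 1e-6 / (length_bp * BP_MW) * AVOGADRO

input_molecules = molecules(10, 1000)   # 10 ug of ~1 kb fragments
templated_beads = 10e6                  # ~10 million sequenced beads

print(f"input molecules: {input_molecules:.3g}")                     # ~9.26e+12
print(f"molecular yield: {templated_beads / input_molecules:.1e}")   # ~1e-06, i.e. ~0.0001%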

I do not mean to single out Roche here, I think protocols for all instrument systems are looking at fractions of a percent molecular yield. As long as one has plenty of DNA, maybe it does not matter. But sometimes DNA (or RNA) is limiting, no?

And what if there is bias in the loss process? Most of us sweat adding a few more cycles of PCR into our library prep procedure because we know PCR can bias our results. But I have never met a single person who worried that the 99.9% (add as many nines as you care to) of DNA molecules being lost during library construction might have a sequence-composition biased component to their loss.

If I get any response (other than a blank stare) from those designing these protocols about the molecular yield, it is usually that the yields in each step are not 100%. The implication, I presume, is that these yield losses are multiplicative. Fair enough: how many steps with 50% yield do I need to lose 99.9% of my DNA? That would be ten steps.
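(A two-line sanity check on that step count, in Python, illustrative only:)

Code:
import math

# Serial steps at 50% recovery needed to keep only 0.1% of the input:
print(math.log(0.001) / math.log(0.5))   # ~9.97, i.e. about ten steps, since 0.5**10 ~ 0.001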

I do not think most library construction steps have yields as low as 50%. Instead, I think it more likely that:

(A) A few steps have extremely low molecular yields and

(B) The protocols we are using rely on our being able to visualize the molecules and their size distribution for purposes of quality control.

I am going to ignore (B) for the purposes of the rest of this post.

As for (A), most of the methodologies I see being developed for low amounts of starting material are focused on amplification. It might be worth taking a look at where DNA (or RNA) is being lost and tightening that up. A couple of places to look would be the percentage of ends successfully repaired after mechanical fragmentation of DNA, and chemical DNA damage. The latter may or may not be a non-issue. But think about it: how often do you worry about the redox state of your DNA? How about UV damage from the sunlight streaming in through your lab windows?

Might 90% of the molecules in a typical DNA prep be impossible to replicate without repair beyond the end repair we normally deploy? Could that number be 99% or 99.9%? Real question. I would like to know.

--
Phillip

(Notes)
1. Okay, yeah, using some standard numbers, like 650 MW for a base pair, the number is really 926 billion molecules, not 1 trillion. But nothing I discuss here would be sensitive to less than 10% tolerances, so the difference is safe to ignore...
Old 10-27-2009, 07:35 AM   #2
Joann
Senior Member
 
Location: Woodbridge CT

Join Date: Oct 2008
Posts: 231
Default genome?

If that's the methodology, my question is

So.....what genome ends up getting sequenced?
Old 10-27-2009, 08:00 AM   #3
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Quote:
Originally Posted by Joann View Post
If that's the methodology, my question is

So.....what genome ends up getting sequenced?
Not sure I follow you.
Old 10-27-2009, 08:33 PM   #4
krobison
Senior Member
 
Location: Boston area

Join Date: Nov 2007
Posts: 747
Default

Check out these papers: each suggests that the "I need enough DNA to visualize" angle is why so much input DNA is required; you apparently can get by with very little DNA if you have a better way to track and quantitate it.

Anyone here routinely using these protocols or similar ones? How do they really behave?

http://www.biomedcentral.com/1471-2164/10/116
BMC Genomics. 2009 Mar 19;10:116.
Digital PCR provides sensitive and absolute calibration for high throughput sequencing.
White RA 3rd, Blainey PC, Fan HC, Quake SR.

Department of Bioengineering at Stanford University and Howard Hughes Medical Institute, Stanford, CA 94305, USA. raw937@sbcglobal.net
BACKGROUND: Next-generation DNA sequencing on the 454, Solexa, and SOLiD platforms requires absolute calibration of the number of molecules to be sequenced. This requirement has two unfavorable consequences. First, large amounts of sample-typically micrograms-are needed for library preparation, thereby limiting the scope of samples which can be sequenced. For many applications, including metagenomics and the sequencing of ancient, forensic, and clinical samples, the quantity of input DNA can be critically limiting. Second, each library requires a titration sequencing run, thereby increasing the cost and lowering the throughput of sequencing. RESULTS: We demonstrate the use of digital PCR to accurately quantify 454 and Solexa sequencing libraries, enabling the preparation of sequencing libraries from nanogram quantities of input material while eliminating costly and time-consuming titration runs of the sequencer. We successfully sequenced low-nanogram scale bacterial and mammalian DNA samples on the 454 FLX and Solexa DNA sequencing platforms. This study is the first to definitively demonstrate the successful sequencing of picogram quantities of input DNA on the 454 platform, reducing the sample requirement more than 1000-fold without pre-amplification and the associated bias and reduction in library depth. CONCLUSION: The digital PCR assay allows absolute quantification of sequencing libraries, eliminates uncertainties associated with the construction and application of standard curves to PCR-based quantification, and with a coefficient of variation close to 10%, is sufficiently precise to enable direct sequencing without titration runs.

PMID: 19298667 [PubMed - indexed for MEDLINE]

http://nar.oxfordjournals.org/cgi/pm...&pmid=18084031
Nucleic Acids Res. 2008 Jan;36(1):e5. Epub 2007 Dec 15.
From micrograms to picograms: quantitative PCR reduces the material demands of high-throughput sequencing.
Meyer M, Briggs AW, Maricic T, Höber B, Höffner B, Krause J, Weihmann A, Pääbo S, Hofreiter M.

Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, D-04103 Leipzig, Germany. mmeyer@eva.mpg.de
Current efforts to recover the Neandertal and mammoth genomes by 454 DNA sequencing demonstrate the sensitivity of this technology. However, routine 454 sequencing applications still require microgram quantities of initial material. This is due to a lack of effective methods for quantifying 454 sequencing libraries, necessitating expensive and labour-intensive procedures when sequencing ancient DNA and other poor DNA samples. Here we report a 454 sequencing library quantification method based on quantitative PCR that effectively eliminates these limitations. We estimated both the molecule numbers and the fragment size distributions in sequencing libraries derived from Neandertal DNA extracts, SAGE ditags and bonobo genomic DNA, obtaining optimal sequencing yields without performing any titration runs. Using this method, 454 sequencing can routinely be performed from as little as 50 pg of initial material without titration runs, thereby drastically reducing costs while increasing the scope of sample throughput and protocol development on the 454 platform. The method should also apply to Illumina/Solexa and ABI/SOLiD sequencing, and should therefore help to widen the accessibility of all three platforms.
Old 10-28-2009, 04:19 AM   #5
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Yes, I de-emphasized this point in my original post, because it has received some attention and methods have been developed to address part of this particular issue. (Doing QC on the size distribution of a library you cannot see on a gel or lab chip would still be tricky.)

Note, however, that both papers appear to suffer from the same dismal molecular yields of "input DNA" to "library molecules".

In the Meyer paper, the bonobo sample (table 2) starts with 500 ng and a mean fragment size of 500 bases. Using the "1 ug of 1 kb DNA is about 1 trillion molecules" rule of thumb I suggested earlier, that equals 1 trillion 500-base molecules (double stranded). Meyer succeeds in isolating 50,000 beads after enrichment from the 1 trillion molecules he started with.

Molecular yield: 5E+04/1E+12 = 5E-08
that is, 0.000005%

That yield is a 3x overestimate if you only count sequence-pass reads generated.

Similarly, if you look at "additional file 2" in the White paper, the lowest input DNA amount used in a shotgun library is 0.7 ug of 550 bp mean size. Again, over 1 trillion molecules to start with. This yielded 7E+05 to 1E+06 ssDNA library molecules (depending on quantitation method). That is an excellent molecular yield: 0.0001%. I still would like to know where most of the 99.9999% of the molecules went, though.
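(To reproduce those two yields with the same ~650 g/mol-per-bp rule of thumb, a minimal Python sketch, illustrative only:)

Code:
AVOGADRO = 6.022e23
BP_MW = 650.0   # average g/mol per base pair of dsDNA

def molecules(mass_ug, length_bp):
    return mass_ug * 1e-6 / (length_bp * BP_MW) * AVOGADRO

# Meyer et al., bonobo library: 500 ng of ~500 b (dsDNA) fragments -> ~5e4 enriched beads
print(f"Meyer yield: {5e4 / molecules(0.5, 500):.0e}")   # ~5e-08, i.e. ~0.000005%

# White et al., additional file 2: 0.7 ug of ~550 bp fragments -> ~1e6 ssDNA library molecules
print(f"White yield: {1e6 / molecules(0.7, 550):.0e}")   # ~8e-07, i.e. ~0.0001%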

But both papers show that trillions of library molecules are not necessary to get a good emulsion PCR. That is pretty well accepted these days.

White et al. http://www.biomedcentral.com/1471-2164/10/116 do at least point in the direction of the 500 pound gorilla:

Quote:
It is natural to expect that library preparation protocols developed with the capacity to handle up to five micrograms of input are far from optimal with respect to minimizing loss from nanogram or picogram samples. A procedure optimized for trace samples with reduced reaction volumes and media quantities, possibly formatted in a microfluidic chip, has the potential to dramatically improve the recovery of library molecules, allowing preparation of sequencing libraries from quantities of sample comparable to that actually required for the sequencing run, e.g. close to or less than one picogram.
--
Phillip
Old 10-28-2009, 12:06 PM   #6
What_Da_Seq
Member
 
Location: RTP

Join Date: Jul 2008
Posts: 28
Default

How many single/di/trinucleotides are being generated by the fragmentation process (invisible to visualization), and what percentage of DNA is lost in the fragment sizing step? Phillip, I get your line of inquiry and I am wondering whether single-molecule sequencing improves on the abysmal efficiency.
My 1.25 cent
Old 10-28-2009, 12:42 PM   #7
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Quote:
Originally Posted by What_Da_Seq View Post
How many single/di/trinucleotides are being generated by the fragmentation process (invisible to visualization), and what percentage of DNA is lost in the fragment sizing step? Phillip, I get your line of inquiry and I am wondering whether single-molecule sequencing improves on the abysmal efficiency.
My 1.25 cent
Depends on the method. Nebulization/Hydroshear probably produces very few oligomers. But size selection will drastically reduce the amount of DNA. Still, this is usually not much more than a 90% loss.

My guess is that the majority of the subsequent loss is a result of (1) unrepairable ends and (2) DNA damage that prevents DNA replication.

--
Phillip
Old 10-30-2009, 01:42 AM   #8
McTomo
Junior Member
 
Location: Germany

Join Date: Jan 2008
Posts: 3
Default

In figure 2 of this paper you can see where your DNA goes and the efficiency of each step in the 454 library preparation process:
http://www.biotechniques.com/Biotech...ues-92046.html
The biggest loss seems to be at the NaOH melting step.
Old 10-31-2009, 11:02 AM   #9
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Quote:
Originally Posted by McTomo View Post
In figure 2 of this paper you can see where your DNA goes and the efficiency of each step in the 454 library preparation process:
http://www.biotechniques.com/Biotech...ues-92046.html
The biggest loss seems to be at the NaOH melting step.
Thanks, that is very interesting. I knew about the highly variable and generally extremely low yields from the library immobilization/ssDNA elution. Bruce Roe's lab, for example, discards that step altogether. But the Maricic and Paabo method does make it seem much more attractive.

A couple of notes. This paper only deals with post adaptor ligation DNA loss, because it starts with a PCR product/library molecule. Also, even the 99% potential loss of this step only explains 2 of the >6 orders of magnitude of DNA loss in the Roche protocol.

As I've mentioned before I think most of the rest is probably the result of un-repairable ends and DNA damage that has rendered a given strand un-replicatable.

--
Phillip
Old 11-02-2009, 02:51 AM   #10
McTomo
Junior Member
 
Location: Germany

Join Date: Jan 2008
Posts: 3
Default

Quote:
Originally Posted by pmiguel View Post
A couple of notes. This paper only deals with post adaptor ligation DNA loss, because it starts with a PCR product/library molecule. Also, even the 99% potential loss of this step only explains 2 of the >6 orders of magnitude of DNA loss in the Roche protocol.
If you multiply the losses in each step of the library preparation process (starting at the blunting), you come to ~10% of the starting DNA ending up in the 454 library. Even though a PCR product was used, it has to be repaired: the overhanging A's have to be removed and the phosphates have to be added. However, I agree that there might be other types of damage in sheared genomic DNA that can't be repaired.
Old 11-02-2009, 03:35 AM   #11
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Quote:
Originally Posted by McTomo View Post
If you multiply the losses in each step of the library preparation process (starting at the blunting), you come to ~10% of the starting DNA ending up in the 454 library. Even though a PCR product was used, it has to be repaired: the overhanging A's have to be removed and the phosphates have to be added. However, I agree that there might be other types of damage in sheared genomic DNA that can't be repaired.
Yes, my guess is that non-enzymatic fragmentation methods produce some ends that cannot be repaired by the typical T4-polymerase/T4-PNK. I posted my speculation on this topic, based largely on a very old paper:

http://seqanswers.com/forums/showthread.php?t=2759


The upshot was that sonication predominantly broke C-O bonds. While these C-O breaks may proceed through solvolysis to C-OH ends, other outcomes are conceivable. Unclear what ends nebulization/hydroshearing produce.

While an unrepairable end, on either end, of a DNA fragment will prevent creation of a library amplicon from that fragment, there are other issues to consider. DNA damage may prevent replication of a DNA strand. How damaged is the typical DNA prep? I'm sure this has been considered in the literature. But a PCR reaction, lacking the support of a cellular environment, would be much more susceptible to chain-terminating DNA damage than an in vivo assay would detect.

I think this is why the SOLiD protocols invariably utilize a PCR step prior to ePCR. That way, amplifiable library molecules will predominate in the sample, and assays of that pre-amplified sample will more accurately predict that sample's behavior in ePCR.

--
Phillip
Old 11-03-2009, 03:55 PM   #12
happy
Junior Member
 
Location: Boston

Join Date: Oct 2009
Posts: 1
Default

How many genomes (human haploid) are in a ug of DNA?
Old 11-04-2009, 03:38 AM   #13
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Quote:
Originally Posted by happy View Post
How many genomes (human haploid) are in a ug of DNA?
How about the chicken haploid genome instead? That is 1 billion bp.

If 1 ug of 1 thousand bp fragments is 1 trillion molecules, that is the same as saying that a quadrillion bp genome (1 thousand x 1 trillion = 1E+03 x 1E+12 = 1E+15 = 1 quadrillion) is 1 ug.

So:

Code:
genome size (bp)     genome mass
--------------------------------
1 quadrillion        1 ug
1 trillion           1 ng
1 billion            1 pg
1 million            1 fg
So a haploid chicken genome is 1 pg. 1 million haploid chicken genomes are in a ug of chicken DNA.

That means a haploid human genome is 3 pg. So 1 ug of human DNA is roughly 333,333 human genomes.
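(The same arithmetic as a small Python sketch, illustrative only; the genome sizes are the round numbers used above:)

Code:
# 1 quadrillion bp ~ 1 ug  =>  ~1e-9 pg per bp (same rule of thumb as above)
PG_PER_BP = 1e-9

for name, genome_bp in [("chicken", 1e9), ("human", 3e9)]:
    genome_pg = genome_bp * PG_PER_BP
    copies_per_ug = 1e6 / genome_pg          # 1 ug = 1e6 pg
    print(f"{name}: {genome_pg:.0f} pg per haploid genome, "
          f"{copies_per_ug:,.0f} genomes per ug")
# chicken: 1 pg per haploid genome, 1,000,000 genomes per ug
# human:   3 pg per haploid genome,   333,333 genomes per ug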

--
Phillip
Old 11-17-2009, 05:50 PM   #14
Nitrogen-DNE-sulfer
Member
 
Location: In CPT code hell

Join Date: Aug 2009
Posts: 20
Default

Great Thread,

Stepping back a bit to dissect this, we have been testing how far we can go with simpler fragment libraries as a first measure. Most circularized protocols have many inefficient steps, so we began by quantitating how much DNA we can get from just fragmenting DNA and adapting it, and then counting distinct molecules on the back end. We have gone as low as 750 pg of already-sheared DNA to generate 30-40M distinct 50mer human reads. I think this is a very key point. This is roughly 300 copies of the genome but, most importantly, we didn't Covaris this DNA. It came from maternal bloodstream, so it was enzymatically digested in situ or in vivo. Not surprisingly, it also has a very different GC content than Covaris'd DNA.
The reason I find this intriguing is that all methods eventually go through a final fragment adaptor ligation, so it's important to know the efficiency of this step, and it is after all the simplest to measure. We will be backing up into the various circularization protocols shortly, but we already know the SOLiD circles are 10-20% efficient at the lengths mentioned above.

In terms of Covaris'd DNA, I will look through our data, but we have performed 600M reads on 1 ug of buccal DNA Covaris'd from a patient and have not saturated this library. We probably need to go deeper to understand whether the different shearing methods are having a damaging effect.

I found the Complete Genomics paper fairly well written with regard to exact pmols at each step. Lots of amplification along the way, but it's clear we need protocols that speak to these quantities at every step on the other platforms as well.

The final point I'd add to the discussion is that not all quantified DNA is amplifiable or makes it to a bead or to a cluster. We're working with emPCR on SOLiD, and we assume 1/2 to 2/3 of our reactors have DNA and no beads. We lean on pushing the bead Poisson high and the template Poisson low, as 2 beads in a reactor don't kill us but 2 templates do.

Similar effects may exist on the Poisson curves for clusters; i.e., flow cells must be flooded at one concentration, of which only a portion can seed the flow cell surface while molecules exist throughout the whole volume. I'm still unclear whether both surfaces amplify and only one being imaged creates another factor-of-2 loss?
Old 11-18-2009, 04:42 AM   #15
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Quote:
Originally Posted by Nitrogen-DNE-sulfer View Post
Great Thread,

[...]
from just fragmenting DNA and adapting it, and then counting distinct molecules on the back end. We have gone as low as 750 pg of already-sheared DNA to generate 30-40M distinct 50mer human reads. I think this is a very key point. This is roughly 300 copies of the genome but, most importantly, we didn't Covaris this DNA. It came from maternal bloodstream, so it was enzymatically digested in situ or in vivo.
[...]
(30-40M unique starts, right? Two reads that map to the same start position in a genome may (or may not) derive from a single input DNA molecule. That is why using single-sided reads makes this process difficult to assess.)

40M reads implies 40M 80-130 bp insert amplicons. What yield does that represent?

1 pg of 100 bp DNA is roughly 10 million molecules. So you started with 7.5 billion molecules (presuming they were all ~100 bp -- which, obviously, they would not have been). That would imply that roughly 1 in 200 of the original molecules were successfully converted to templated beads.
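(A minimal Python sketch of that arithmetic, illustrative only; it assumes all 750 pg was in ~100 bp fragments, which, as noted, it was not:)

Code:
AVOGADRO = 6.022e23
BP_MW = 650.0   # g/mol per base pair of dsDNA

input_pg, mean_bp = 750, 100                    # assume all fragments ~100 bp
input_molecules = input_pg * 1e-12 / (mean_bp * BP_MW) * AVOGADRO
distinct_reads = 40e6

print(f"input molecules: {input_molecules:.1e}")                        # ~6.9e+09
print(f"converted: about 1 in {input_molecules / distinct_reads:.0f}")  # ~174; ~1 in 200 with the rounder 10M-per-pg figure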

Was this DNA already size selected when you measured it as 750 pg?

I would expect <10% of any smear of DNA to fall within the fairly tight size distribution used here (150-200 bp for the full amplicon length, i.e. 80-130 bp of insert). Even as low as 1% would not be surprising.

So, yes that result would be consistent with nearly all the molecules being ligatable on both ends and amplifiable. Or less than 10% of them being so. Hard to say.

Quote:
Originally Posted by Nitrogen-DNE-sulfer View Post
The reason I find this intriguing is that all methods eventually go through a final fragment adaptor ligation, so it's important to know the efficiency of this step, and it is after all the simplest to measure.
By qPCR? By sequencing, it is not so simple. The human genome is replete with repetitive DNA, so single end reads are difficult to assess as to their derivation from a unique chunk of your original sample DNA. This is because the pre-PCR amplification step would make lots of copies of all amplifiable amplicons.


Quote:
Originally Posted by Nitrogen-DNE-sulfer View Post
In terms of Covaris'd DNA, I will look through our data, but we have performed 600M reads on 1 ug of buccal DNA Covaris'd from a patient and have not saturated this library. We probably need to go deeper to understand whether the different shearing methods are having a damaging effect.
Yes, especially since minor changes in the shearing buffer may lead to different outcomes. I'm not a chemist, but if the C-O bond breakage that apparently predominates in sonication-mediated DNA fragmentation

http://seqanswers.com/forums/showthread.php?t=2759

can result in different fragment ends, then factors such as pH may influence which end-type does result. That is, a break between the C5' and O or C3' and O may result in the desired outcome: hydrolytic restoration of the end to a 5' or 3' OH. Or it could result in undesired outcomes such as ribose-sugar ring opening or maybe even loss of C5' entirely. (Again, I'm not a chemist, the above is rampant speculation.) Point being, mixtures of T4-polymerase/T4-PNK probably cannot repair the latter outcomes into something ligatable.

Quote:
Originally Posted by Nitrogen-DNE-sulfer View Post
[...]
The final point I'd add to the discussion is that not all quantified DNA is amplifiable or makes it to a bead or to a cluster. We're working with emPCR on SOLiD, and we assume 1/2 to 2/3 of our reactors have DNA and no beads. We lean on pushing the bead Poisson high and the template Poisson low, as 2 beads in a reactor don't kill us but 2 templates do.
Then it does not seem you would lose many amplicons in ePCR. That is, with the bead Poisson high, nearly all reactors have beads; with the template Poisson low, most of the reactors have a bead but no template, but where there is a template it will almost certainly have a bead to bind it.
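(A toy Poisson calculation along those lines, in Python. The loading means are made-up placeholders for illustration, not actual SOLiD numbers:)

Code:
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

bead_mean = 2.0        # hypothetical mean beads per reactor ("bead Poisson high")
template_mean = 0.1    # hypothetical mean templates per reactor ("template Poisson low")

p_no_bead = poisson_pmf(0, bead_mean)
p_multi_template = 1 - poisson_pmf(0, template_mean) - poisson_pmf(1, template_mean)

print(f"reactors with no bead:      {p_no_bead:.1%}")          # ~13.5%
print(f"reactors with >1 template:  {p_multi_template:.2%}")   # ~0.47%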

Quote:
Originally Posted by Nitrogen-DNE-sulfer View Post
Similar effects may exist on the Poisson curves for clusters; i.e., flow cells must be flooded at one concentration, of which only a portion can seed the flow cell surface while molecules exist throughout the whole volume. I'm still unclear whether both surfaces amplify and only one being imaged creates another factor-of-2 loss?
I don't have a Solexa, so I don't know. But I will note, tangentially, that it is interesting that after a couple of years of direct competition between Solexa and SOLiD, it now appears that the two platforms are veering into slightly different niches. Solexa, with paired-end 100 base reads seems poised to conquer the de novo sequencing niche. Whereas SOLiD appears to have abandoned longer reads to concentrate on increasing read numbers. Which, everything else being equal, would give them control of the resequencing niche (including digital gene expression). That said, everything else is not equal. Illumina had instruments out in the field at least a full year before AB did. And then there is the PacBio instrument looming...

--
Phillip
Old 11-18-2009, 06:24 AM   #16
Nitrogen-DNE-sulfer
Member
 
Location: In CPT code hell

Join Date: Aug 2009
Posts: 20
Default

Yes, pairs are always better for measuring start points and distinct molecules, but with single-ended reads we just assume that if any replicated start point is identical it was PCR induced. This is conservative, and we only count unique placements in the genome, yet again an underestimate. The above example was 5 cycles of PCR with long extensions.
We have also performed no-amplification libraries, which eliminate the PCR replication problem. TaqMan is another angle we use to measure at the linker step, and I'll have to check our notes, but I believe we are getting 300M-500M positive beads with 1 ng of 200 bp library. About 15% of our total beads (3B total) amplify, which suggests few have 2 molecules in the bubble based on Poisson. Not far off from your estimate of 5M molecules per 1 pg of 200 bp library, or 5B per ng.

In our case the DNA was not size selected, so we can't blame the gel extraction or AMPure.

We spent some time looking into peroxide formation during the shearing step and monitoring heat. Curious if anyone has used the NEB PreCR kit or their Fpg and hOGG1 repair enzymes to repair other forms of DNA damage, like 8-oxoG or glycosidic bond breaks, which may be induced by this method.
Alternatively, DNase-based forms of digesting DNA may be less caustic?

Despite leaning high on beads in the emPCR, we currently can't get a bead into every reactor without a lot of clumping; probably only 20-25% of the reactors are populated. RainDance-like techniques have been contemplated but would take days to make billions of reactors.
Old 11-18-2009, 10:52 AM   #17
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Quote:
Originally Posted by Nitrogen-DNE-sulfer View Post
In our case the DNA was not size selected, so we can't blame the gel extraction or AMPure.
Not size selected at all? You must have size selected at some point, right? Otherwise you would get slammed with primer dimers, etc.

If you had size-selected prior to going into adaptor ligation, then all the DNA has the potential to form a legitimate amplicon. Whereas if not, a lot of that 750 pg would really be contributed by fragments outside a useful size range.

Quote:
Originally Posted by Nitrogen-DNE-sulfer View Post
We spent some time looking into peroxide formation during the shearing step and monitoring heat. Curious if anyone has used the NEB PreCR kit or their Fpg and hOGG1 repair enzymes to repair other forms of DNA damage, like 8-oxoG or glycosidic bond breaks, which may be induced by this method.
Alternatively, DNase-based forms of digesting DNA may be less caustic?
I have not. But I also have no clear idea how much DNA damage is present in a typical genomic DNA prep. If, for example, 90% of DNA spans longer than a few hundred bases contained lesions bad enough to stall out PCR replication, would we even notice?

PCR is an exponential process, after all. The 90% of strands that could not be extended would not contribute as template to later cycles. So the 10% that did extend far enough for the reverse primer to anneal would quickly overtake those that stalled.

Again, even if DNA damage isn't that bad in most DNA preps, it could be that a DNA prep you happened to have out near a window happened to pick up some pyrimidine dimers from the sunlight streaming in. Who knows. Heretofore all the assays I can think of would be insensitive to even fairly high levels of damage. If even 1% of the 1kb stretches of DNA in a prep are damage-free that still gives you 10 billion intact 1 kb stretches per ug of DNA. If the 10 billion work, then the 990 billion that do not will not be noticed.
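(One way to put numbers on that: if polymerase-stalling lesions are scattered randomly along the DNA, the damage-free fraction of fragments falls off exponentially with length. A Python sketch, illustrative only, with made-up lesion rates:)

Code:
import math

def intact_fraction(fragment_bp, lesions_per_kb):
    """Poisson probability that a fragment of this length carries zero lesions."""
    return math.exp(-lesions_per_kb * fragment_bp / 1000.0)

molecules_per_ug = 1e12                      # 1 ug of 1 kb fragments, rule of thumb
for lesions_per_kb in (0.1, 1.0, 2.3, 4.6):  # hypothetical damage rates
    frac = intact_fraction(1000, lesions_per_kb)
    print(f"{lesions_per_kb:>4} lesions/kb -> {frac:6.1%} intact, "
          f"~{frac * molecules_per_ug:.0e} usable 1 kb molecules per ug")
# Even at ~4.6 lesions/kb (99% of fragments damaged), ~1e10 intact molecules remain per ug.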

We may just be entering an era where we do need all 1 trillion molecules. If so, we need to either make sure this type of damage is not an issue or find ways to mitigate it.
--
Phillip
Old 11-18-2009, 11:22 AM   #18
What_Da_Seq
Member
 
Location: RTP

Join Date: Jul 2008
Posts: 28
Default

You are right IF damage occurs randomly. IF NOT, all the cool events happen in the 90% that you are losing.
Old 11-18-2009, 11:30 AM   #19
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Quote:
Originally Posted by What_Da_Seq View Post
You are right IF damage occurs randomly. IF NOT, all the cool events happen in the 90% that you are losing.
There you go. Any time you lose 90% of a sample (for whatever reason), the remnant may be a biased representation of your initial sample. Because there is no reason to presume the loss is unbiased.

In the case of many of these library construction protocols we lose more like 99.9999% of our initial sample.
Old 11-23-2009, 09:32 AM   #20
seqAll
Member
 
Location: China

Join Date: Nov 2009
Posts: 21
Default

Quote:
Originally Posted by pmiguel View Post
...Also, even the 99% potential loss of this step only explains 2 of the >6 orders of magnitude of DNA loss in the Roche protocol...
--
Phillip
Just wondering how precise the number 99% is. It was said to be >99%. But maybe 98.5%, 99.9%, 99.99%, ...? So that explains maybe 2, 3, 4... (unlikely to be 6, though) orders?

Loss or damage in other steps would be interesting to know.
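(The conversion from percent loss to orders of magnitude is easy to tabulate; a Python snippet, illustrative only:)

Code:
import math

# Orders of magnitude of loss explained by a single step with the given fractional loss:
for loss in (0.985, 0.99, 0.999, 0.9999):
    print(f"{loss:.2%} loss -> {-math.log10(1 - loss):.1f} orders of magnitude")
# 98.50% -> 1.8,  99.00% -> 2.0,  99.90% -> 3.0,  99.99% -> 4.0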

Last edited by seqAll; 11-23-2009 at 10:19 AM.
Tags
library construction, yield
