#1
Member
Location: United Kingdom | Join Date: Aug 2011 | Posts: 12
I'm making Nextera libraries from mosquito genomic DNA using Illumina's Nextera kit. I ran the protocol exactly as stated and ran the resulting libraries on a Bioanalyzer HS DNA chip. My insert sizes are larger than expected (~1000 bp observed versus 300-500 bp expected) [picture attached; apologies for the crude display of fragment size].
I spoke with Illumina's technical support, and they suggested this was due to one of the following:

- Too much starting material
- Tagmentation incubation too short
- Tagmentation incubation not warm enough

The Illumina protocol states that "libraries with an average size >1kb may require clustering at several concentrations to achieve optimal density," which suggests to me that this isn't a problem and that the libraries can still be sequenced with a bit of optimization. We're aligning back to a reference sequence, so I don't think large insert sizes will be a problem from the bioinformatics perspective (in fact, they may help with mapping!). Is this something I should be concerned about?
#2
Member
Location: United Kingdom | Join Date: Aug 2011 | Posts: 12
Here are the suggestions offered by tech support:
- Make sure there isn't any ethanol carryover from the DNA extraction.
- Elute DNA in water. I used Qiagen AE buffer, which is TE, and apparently there is concern that EDTA might interfere with enzyme activity. Interesting, since the Epicentre protocol suggested using TE...
- Check that the thermocycler is operating at the correct temperature.
- Decrease the amount of starting material.
- If we do decide to go ahead with the sequencing, clusters need to be generated at a lower density because large fragments don't cluster as well.

I'm going to try making the libraries again with varying amounts of DNA eluted in water, since we don't want to reduce the amount of sequence that we generate. Will update once I know how well the modifications work.
#3
Member
Location: Madison | Join Date: Feb 2010 | Posts: 18
We have seen the same thing. As previously suggested, reducing the amount of starting material to 25-30 ng helps. Also, try clustering at 6 pM; that should give about 500-600K clusters per mm2, which worked well for us with longer fragments.
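For anyone working back from a quantified stock to a loading concentration like the 6 pM above, it's a plain C1V1 = C2V2 dilution. Here is a minimal sketch; the 2 nM stock concentration and 600 ul final volume are hypothetical examples, not values from this thread.

```python
# C1*V1 = C2*V2 dilution helper for hitting a target loading
# concentration (e.g. 6 pM). All numbers below are made-up examples.

def stock_volume_ul(stock_pm: float, target_pm: float, final_vol_ul: float) -> float:
    """Volume of library stock (ul) to dilute to target_pm in final_vol_ul."""
    if target_pm > stock_pm:
        raise ValueError("stock is already more dilute than the target")
    return target_pm * final_vol_ul / stock_pm

# e.g. a 2 nM (2000 pM) denatured library diluted to 6 pM in 600 ul:
vol = stock_volume_ul(stock_pm=2000, target_pm=6, final_vol_ul=600)
print(f"{vol:.1f} ul stock + {600 - vol:.1f} ul buffer")  # 1.8 ul stock + 598.2 ul buffer
```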
#4
Senior Member
Location: Purdue University, West Lafayette, Indiana | Join Date: Aug 2008 | Posts: 2,317
Was the HS chip run on the sample before or after amplification? I presume before. (If after, then you may just be seeing the old double-peak phenomenon -- called "bird-nesting" by Epicentre.)
If you want to obtain sequence from the stuff around 1 kb, you would need to get rid of (size select against) the shorter fragments. For reasons I don't comprehend, DNA above 1 kb just does not compete well against shorter fragments during clustering.

If you just run the library as-is, there is a possibility that your results may be biased. That is, the Nextera tagmentation transposase may have found pockets of the genome you are sequencing that it really likes, and others it does not; hence the vaguely bimodal distribution that you see.

My tendency would be to do a time series, collecting fractions at intervals. Then pool it all, run it on a gel, and size select to something reasonable (400-600 bp?). That way, if the transposase is biased, you will get a mixture of all the genomic pockets it lands in, as the larger fragments become more tagmented in the later time fractions. Or you could give it a shot and see if you are seeing high bias.

-- Phillip
#5
Member
Location: United Kingdom | Join Date: Aug 2011 | Posts: 12
The image is post-amplification and post-AMPure clean up. I ran the same sample pre-amplification, and the peak was at the same place. That suggests to me that it wasn't due to "bird-nesting," and tech support agreed.
Thanks for the very helpful replies!
#6
Member
Location: United Kingdom | Join Date: Aug 2011 | Posts: 12
I tried re-making my Nextera libraries with two modifications.
1) I suspended my starting DNA in water instead of TE, since there was some concern about EDTA interfering with tagmentation.
2) I tried reducing the amount of starting material (30 ng vs 50 ng).

Sample 1 = Tagmented DNA post-Zymo cleanup, 30 ng starting material in H2O
Sample 2 = Tagmented DNA post-Zymo cleanup, 50 ng starting material in H2O
Sample 3 = Final library, post-PCR cleanup, 30 ng starting material in H2O
Sample 4 = Final library, post-PCR cleanup, 50 ng starting material in H2O

From this, it's clear that starting with DNA in TE or water gives exactly the same results, since Sample 4 looks exactly like the libraries prepared in my first attempt. Reducing my input DNA to 30 ng did not seem to help, since this led to a smaller insert size than desired (Sample 3)!

I guess I'll next try running my tagmentation for different lengths of time and at different temperatures, and running the DNA on a chip.
#7
Senior Member
Location: Purdue University, West Lafayette, Indiana | Join Date: Aug 2008 | Posts: 2,317
Anyway, a time course does sound like a good choice.

-- Phillip
#8
--Site Admin--
Location: SF Bay Area, CA, USA | Join Date: Oct 2007 | Posts: 1,358

[quoted post; content not preserved]
#9
Member
Location: United Kingdom | Join Date: Aug 2011 | Posts: 12
Ah, good to know! I don't have too much experience with Bioanalyzers. Regardless, it doesn't seem that reducing the amount of starting material to 30ng solved my problem.
#10
Member
Location: United Kingdom | Join Date: Aug 2011 | Posts: 12
I tried incubating my samples on a heat block instead of in a PCR machine in case our machine is mis-calibrated, and I tried increasing the tagmentation step to 10 minutes. My insert sizes still peak around 1 kb.
I'm wondering if I'm losing my small fragments during the Zymo clean-up. I've been using the Zymo column kit instead of the plate kit since I'm processing a small number of samples. I've been using the spin speeds from the Zymo protocol, but I'm wondering if I need to reduce them.

I also noticed that the Zymo protocol suggests a DNA binding buffer:sample ratio of 5:1 for DNA fragments, whereas the Illumina protocol uses 3.6:1. (Interestingly, the original Epicentre protocol used 5:1.)

Does anyone have a modified protocol using columns that they could share?
#11
Senior Member
Location: Purdue University, West Lafayette, Indiana | Join Date: Aug 2008 | Posts: 2,317
What QC do you do on your genomic DNA? It might be worth doing an RNase treatment + Ampure (or whatever) clean-up on it.

rRNA would not be subject to tagmentation. I'm not sure what size it would run at after degrading via the heat and divalent cations likely present in tagmentation and PCR. But I frequently see genomic DNA preps that are >90% RNA (the record was >99.9% RNA, almost a pure RNA prep). There is far more RNA than DNA in most cells, and a lot of (especially old-school) genomic DNA preps ignore it. So, although a long shot, I thought I should mention it.

-- Phillip
#12
Member
Location: East Bay | Join Date: Jul 2012 | Posts: 26
I have been having very similar problems hitting the cluster density sweet spot with Nextera libraries of yeast genomic DNA. HS chip electropherograms of samples after tagmentation or after PCR (when following the Nextera protocol without modification) have given size distributions peaking at 1 kb or higher, with cluster densities in the 300-400 K/mm2 range from a 10 pM load of such libraries.

I modified the Nextera protocol in three ways (20 ng starting gDNA, 8 PCR cycles, 1 min extension time) and got a poor yield of DNA with a skew toward fragments that were too small (peak around 200 bp). I didn't run this last one on the MiSeq. I also tried the Nextera XT kit, which gave a better size distribution, but the cluster density was even lower.

I will continue trying modifications to the Nextera protocol, but if anyone knows the secret to 800 K/mm2 and 2 Gb of data, I would like to hear it. Thanks!
#13
Junior Member
Location: California | Join Date: Aug 2012 | Posts: 2
I also have had problems with getting Nextera libraries of the right size distribution (insert sizes larger than 1kb). I use mouse and human DNA.
Having super clean DNA does help. So instead of using ethanol precipitation to concentrate DNA, I now use the DNeasy Blood & Tissue Kit, but elute in only 30 uL of water. Varying the tagmentation time did not help. Using Qiagen MinElute columns seems to work fine (I don't use Zymo). I haven't tried using less than 50 ng of DNA, but I will in the future.

I have also found that non-ideal libraries (size distribution from around 400 bp to 1 kb) sequence just fine.
#14
Member
Location: Europe | Join Date: Sep 2012 | Posts: 10

[quoted post; content not preserved]
#15
Member
Location: East Bay | Join Date: Jul 2012 | Posts: 26
This seems to be the best thread in which to post this info.
Since my earlier troubles with low cluster density posted in an earlier thread, the MiSeq has provided as much as 2.5 Gb of good data in a single run using Nextera libraries. The quality of the DNA is important, though I have not identified which impurities have the most effect on cluster density.

Be careful about how you deliver the 50 ng to the Nextera process. Pipetting small volumes from solutions of HMW DNA at high concentration transfers widely different amounts of DNA. Keep solutions below 100 ng/ul.

The amount of data the MiSeq delivers varies quite a bit from run to run (500-2500 Mb), and I have only some clues about what factors contribute to this variability.

I tried using the Bioanalyzer to QC the tagmentation reaction before doing PCR, but this is a huge hassle and I don't recommend it (the quality of this data is unpredictable). However, if you are multiplexing, you must use the Bioanalyzer to QC the libraries themselves (after PCR) so that you get approximately equal representation.

It seems the more fragments there are in the size distribution that are <300 bp or >1500 bp, the lower the cluster density. The amount of smaller fragments depends mostly on the final Ampure cleanup step. You can reduce the amount of the larger fragments by decreasing the PCR extension time from 3 min to 1.5 min. I have also added a couple of extra cycles to the PCR, which increases the fraction of fragments bearing adapters and thus able to form clusters.

I hope this is useful information.
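The "approximately equal representation" goal when multiplexing amounts to pooling equimolar amounts of each library. A minimal sketch of that arithmetic, with made-up library names and concentrations (not values from this thread):

```python
# Equimolar pooling sketch: given each library's measured molarity,
# compute how much of each to pool so all are equally represented.
# Library names and concentrations below are hypothetical examples.

def pooling_volumes_ul(conc_nm: dict, pool_fmol_each: float = 50.0) -> dict:
    """ul of each library needed to contribute pool_fmol_each fmol to the pool.

    fmol = nM * ul, so ul = fmol / nM.
    """
    return {name: pool_fmol_each / nm for name, nm in conc_nm.items()}

libs = {"libA": 10.0, "libB": 4.0, "libC": 25.0}  # nM, e.g. from Bioanalyzer/qPCR
for name, vol in pooling_volumes_ul(libs).items():
    print(f"{name}: {vol:.2f} ul")  # libA: 5.00 ul, libB: 12.50 ul, libC: 2.00 ul
```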
#16
Junior Member
Location: California | Join Date: Aug 2012 | Posts: 2
I just heard that the original Nextera enzyme from Epicentre gave nice peaks, but Illumina's version is not as good. Not that this helps, but it at least explains why the Nextera kit is more difficult than promised.
#17
Member
Location: United Kingdom | Join Date: Aug 2011 | Posts: 12
In case anyone is interested, here's a quick update on the results from our sequencing:
- We sequenced two pools of libraries in two lanes of PE 100 bp HiSeq, with one lane yielding 100x2 million reads and the other yielding 180x2 million reads.
- I analyzed one lane using bwa sampe -a 2000 to allow insert sizes up to 2 kb to be properly paired. From Picard's InsertSizeMetrics, the median insert size is 194 bp (see attached).

It seems to me that the clustering and/or sequencing step greatly biases towards recovery of the shorter fragments, even though the Bioanalyzer is finding a peak size of ~1 kb. We're very pleased with the results and will continue to use Nextera.
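For anyone curious how a median insert size like the 194 bp above falls out of the alignments: Picard's metrics come from the per-pair template lengths (the TLEN field in the BAM). A toy sketch of that calculation in pure Python; the TLEN values below are invented for illustration, not real data from this run.

```python
# Rough sketch of a median insert size calculation, as reported by
# Picard's CollectInsertSizeMetrics. TLEN values here are invented.
from statistics import median

def median_insert_size(tlens):
    """Median absolute template length over paired reads.

    tlens: signed TLEN values, one per read pair; zero entries
    (unpaired/unmapped) are ignored.
    """
    sizes = [abs(t) for t in tlens if t != 0]
    return median(sizes)

tlens = [180, -195, 210, 0, -1050, 190, 198]  # hypothetical TLENs
print(median_insert_size(tlens))  # 196.5
```

Note how a single 1 kb outlier barely moves the median, which is consistent with a short-fragment-dominated library still reporting a small median insert size despite a ~1 kb Bioanalyzer peak.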
#18
Senior Member
Location: Purdue University, West Lafayette, Indiana | Join Date: Aug 2008 | Posts: 2,317

[quoted post; content not preserved]

-- Phillip
#19
Junior Member
Location: United Kingdom | Join Date: May 2013 | Posts: 2
I have just started doing Nextera DNA library preps and I am getting the same larger-than-expected peaks (1000-2000 bp), with some bimodality too. I am attaching a picture of the last 11 libraries I ran on the Bioanalyzer HS chip; this is post-PCR, and we did not run the libraries post-tagmentation (pre-PCR).
Extractions were done with the Qiagen Blood and Tissue kit; we included an RNase treatment and used a buffer without EDTA (Buffer EB).

I was wondering if I should try lowering the input material to 30 ng as a test (all libraries shown had inputs ranging from 41 to 51 ng, but there is no pattern that corresponds with this). I also wondered about trying a longer tagmentation step... but I suspect I will just get more small fragments, still with the large peak around 1000-2000 bp. I am wondering if the bimodality is just an insertion preference bias of the transposome, in which case I guess I can't do anything! It seems that Nextera is highly variable...

Does anyone with previous experience think that my libraries will still sequence OK (100 bp paired-end sequencing on the HiSeq), despite the large peak and some bimodality? How do you optimize the cluster density with bimodal distributions?
#20
Member
Location: East Bay | Join Date: Jul 2012 | Posts: 26
We have had very similar Bioanalyzer traces in the past, but now routinely get unimodal peaks with 400-1000 bp average size. Here are some things we believe are important for optimum results.
1) DNA must be accurately quantified and diluted so that exactly 50 ng is used in the tagmentation reaction. All dilutions of DNA should be done with Tris buffer containing 0.05% Tween 20. DNA at low concentrations can stick to the plasticware, while DNA (especially genomic) at high concentrations can give inaccurate pipetting because of its viscosity. Your variable Bioanalyzer traces indicate too much tagmentation due to a variable and inadequate amount of DNA used in the reactions.

2) Be wary of N501 and possibly other combinations of i7 and i5 bar-coded primers. Use the i7 indices with N505 for the most reliable results. You can order N505 from any oligo supplier and dilute it to 0.5 micromolar.

3) Increase the number of PCR cycles from five to eight and decrease the extension time from three to two minutes.

4) Be extra careful with the Ampure cleanup to avoid getting fragments less than 300 bp. We add 29 ul of beads instead of 30 ul. The MW cutoff is very sensitive to the ratio of beads to PCR reaction.

5) At least for genomic sequencing, we don't think fragments >1 kb in a Nextera library are a problem. However, do make sure they are included in the average size calculation, because this will significantly impact the calculated concentration of that library in the pool.

If anyone else has tips to add to this list, please do. We are still looking to optimize the process. We typically get cluster densities of ~1200 K/mm2, which appears to be close to the optimum, but by flying so close to the max, we occasionally overshoot and the MiSeq can't resolve the clusters. There are many parameters involved in hitting the sweet spot, and we still don't have it under full control.
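Point 5 matters because a library's molarity is derived from its mass concentration and average fragment size, so leaving the >1 kb tail out of the average inflates the calculated molarity. A small sketch using the standard ~660 g/mol per bp approximation for dsDNA; the input values are made-up examples:

```python
# Why average fragment size matters for pooling: molarity is computed
# from mass concentration and average size. 660 g/mol per bp is the
# standard dsDNA approximation; the inputs below are hypothetical.

def library_molarity_nm(conc_ng_per_ul: float, avg_size_bp: float) -> float:
    """Approximate dsDNA library concentration in nM."""
    return conc_ng_per_ul / (660.0 * avg_size_bp) * 1e6

# Same mass concentration, but including a >1 kb tail that doubles the
# average size estimate halves the calculated molarity:
print(round(library_molarity_nm(10.0, 500), 2))   # 30.3
print(round(library_molarity_nm(10.0, 1000), 2))  # 15.15
```

So a library whose large fragments were excluded from the size estimate would be pooled at roughly half its true molar contribution relative to the others.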
Tags: bioanalyzer, nextera