SEQanswers

Old 07-24-2017, 10:28 AM   #101
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,668
Default

You have 4.5 billion reads, and expect to detect contamination from 11% of the data (0.5B/4.5B) at a 90%-100% rate (alignment sensitivity) by observing 89% of data volume (4B/4.5B). So you should expect to detect .11*.89*(.9 to 1) = 8.8% to 9.8% of the total contamination. So, 2000 PPM observed would suggest 20400 PPM to 22700 PPM of actual cross-contamination, with a sufficiently high degree of multiplexing.

Bear in mind, though, that mouse contamination can come from other sources, and different index pairs have different rates of cross-contamination.
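
As a rough check of the arithmetic in this post, here is a minimal Python sketch. The function names are illustrative and not from the thread. It assumes, as the post does, that a hop is detectable only when a read from the 0.5B-read mouse library lands in one of the other libraries, out of 4.5B reads total.

```python
def detectable_fraction(donor_reads, total_reads, sensitivity):
    """Fraction of all cross-contamination expected to be observed."""
    donor_share = donor_reads / total_reads    # hops that originate in the donor library
    recipient_share = 1 - donor_share          # hops that land outside the donor library
    return donor_share * recipient_share * sensitivity


def estimated_actual_ppm(observed_ppm, donor_reads, total_reads, sensitivity):
    """Scale the observed rate up by the fraction that is detectable."""
    return observed_ppm / detectable_fraction(donor_reads, total_reads, sensitivity)


# Assumed inputs taken from the post: 2000 PPM observed, 0.5B donor reads, 4.5B total.
low = estimated_actual_ppm(2000, 0.5e9, 4.5e9, sensitivity=1.0)
high = estimated_actual_ppm(2000, 0.5e9, 4.5e9, sensitivity=0.9)
print(f"{low:.0f} to {high:.0f} PPM")
```

With unrounded fractions this gives roughly 20250 to 22500 PPM, slightly below the 20400 to 22700 PPM obtained from the rounded .11 and .89.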
Old 07-24-2017, 10:59 AM   #102
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,218
Default

I forgot to mention -- IDT has Illumina Unique Dual Indexes -- a set of 96 adapters for sale. Once we have those we can split an S2 run 96 ways and be able to detect index swaps.

What are the HiSeq 3000/4000 instrument users doing? Kind of horrifying if upwards of 2% of reads have been mis-assigned since that instrument started being used.

--
Phillip
Old 07-24-2017, 11:22 AM   #103
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,218
Default

Quote:
Originally Posted by Brian Bushnell View Post
You have 4.5 billion reads, and expect to detect contamination from 11% of the data (0.5B/4.5B)
Yeah, sounds reasonable. But I guess there is still the question of whether the index hop derives from a characteristic of the donor library, the recipient library or both? Illumina is saying that the index donor library definitely plays a role when said library includes unincorporated adapters and/or adapter dimers.

This seems like a really high rate of recombination, no? Do you detect an increase in chimeric inserts? Depending on the mechanism of recombination you stipulate, there might be recombination events at any stretch of similar sequence, not just in the adapters.

Quote:
Originally Posted by Brian Bushnell View Post
at a 90%-100% rate (alignment sensitivity) by observing 89% of data volume (4B/4.5B). So you should expect to detect .11*.89*(.9 to 1) = 8.8% to 9.8% of the total contamination. So, 2000 PPM observed would suggest 20400 PPM to 22700 PPM of actual cross-contamination, with a sufficiently high degree of multiplexing.

Bear in mind, though, that mouse contamination can come from other sources, and different index pairs have different rates of cross-contamination.
These were run as single indexes. But there may be different rates, yes.

I checked the HiSeq run for these environmental samples and we detected 0 mouse hits per 1000 reads for all 21 of the data sets.

--
Phillip
Old 07-24-2017, 12:28 PM   #104
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,668
Default

Quote:
Originally Posted by pmiguel View Post
But I guess there is still the question of whether the index hop derives from a characteristic of the donor library, the recipient library or both?
We have several theories for what was driving this on HiSeq... the most plausible being something like, "library A had too many unincorporated adapters, library B had too many adapter-free inserts, and after mixing them, library B adopted some of the free adapters from library A". Which would indicate that it involves both the donor and recipient library. But I'm not sure if that mechanism is important for NovaSeq.

Quote:
This seems like a really high rate of recombination, no?
Well, it's higher than what I observed for single-index libraries on our NovaSeq, but not by a huge amount.

Quote:
Do you detect an increase in chimeric inserts? Depending on the mechanism of recombination you stipulate, there might be recombination events at any stretch of similar sequence, not just in the adapters.
I have not examined this on the NovaSeq yet, but I saw a much higher rate (a several-fold increase) of chimeric pairs when examining problematic reads on HiSeq. I don't remember the exact details; it might have been that reads mapped as improper pairs had a much higher rate of invalid barcode combinations, or vice-versa.
Old 07-26-2017, 12:49 PM   #105
cement_head
Senior Member
 
Location: Oxford, Ohio

Join Date: Mar 2012
Posts: 185
Default

Quote:
Originally Posted by pmiguel View Post
The recommended method to detect an index swap is to use "Unique Dual Indexes". With these you don't use the same i7 index in multiple pairs. A given i7 index always goes with a fixed i5 index for the run. Then if you detect an i7 index with any i5 index other than its pair, you know an index hop has occurred and the reads are discarded.

This will remove all index hops that are the result of a single recombination event. It will also remove nearly all the double recombinations. So true index hops should be largely detectable.

As to what causes index hopping, I don't think that Illumina is sure. They seem mainly to have a list of "best practices" to use to lower their frequency.

I haven't looked in detail at the process of exclusion amplification either. But I presume that it involves some non-flowcell-tethered PCR amplification.

--
Phillip
Okay, this is interesting and jibes with their basic premise. It also contradicts their Nextera i5/i7 design, wherein index codes are reused multiple times.
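
The unique-dual-index check described in the quoted post can be sketched in a few lines of Python. This is only an illustration: the index sequences are made-up placeholders, and `EXPECTED_PAIRS` and `classify_read` are hypothetical names, not part of any Illumina tool.

```python
# Hypothetical sample sheet: each i7 index is paired with exactly one i5 index.
EXPECTED_PAIRS = {
    "ATTACTCG": "TATAGCCT",
    "TCCGGAGA": "ATAGAGGC",
    "CGCTCATT": "CCTATCCT",
}
KNOWN_I5 = set(EXPECTED_PAIRS.values())


def classify_read(i7, i5):
    """Label a read by whether its i7/i5 combination is an expected pair."""
    expected_i5 = EXPECTED_PAIRS.get(i7)
    if expected_i5 is not None and expected_i5 == i5:
        return "valid"
    if expected_i5 is not None and i5 in KNOWN_I5:
        # Both indexes are in use on the run, but not together: a likely hop.
        return "index_hop"
    return "unknown"


print(classify_read("ATTACTCG", "TATAGCCT"))
print(classify_read("ATTACTCG", "ATAGAGGC"))
```

Reads labeled `index_hop` would be discarded. A double recombination that happens to produce another expected pair would still pass, which is why the quoted post says "nearly all" rather than all double recombinations are removed.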
Old 07-26-2017, 12:55 PM   #106
cement_head
Senior Member
 
Location: Oxford, Ohio

Join Date: Mar 2012
Posts: 185
Talking

Quote:
Originally Posted by Brian Bushnell View Post
This is kind of tangential to NovaSeq, but...

I've suggested that we keep everything on ice whenever possible prior to sequencing, due to the fact that low temperatures retard any kind of activity and thus should inhibit adapter-swapping (which is a huge problem as we run a lot of highly-amplified single cells). But my explanations were too vague to be taken seriously, since I don't know the specifics of the reactions. I would love to have a very clear (and preferably lengthy, rather than concise) explanation of exactly why and when keeping pools on ice should prevent crosstalk, that I can copy and paste (attributing credit, if desired) to the people in charge of making libraries.

I think it is obvious that the longer you let a mixed batch of libraries sit around, and the higher the temperature, the more index-swapping will occur, regardless of the mechanism. But without citing a specific mechanism (and it does not really matter if it is the dominant one), nobody involved with library prep will pay attention to my concerns on the issue (meaning, no tests of ice vs no ice). All I really need is a real mechanism, which seems sufficiently important to cause a test to be run; once that occurs, I'll be satisfied, even if the results are negative and indicate that keeping pooled libraries at a high temperature for a long time seems to be optimal for preventing crosstalk. Not that I'll believe negative results unless I run the experiment myself, but at least I'll believe I did my best. I'll still report the results here.
This whole problem is starting to make more and more sense to me now. Just enough sloppiness at each step probably contributes to a perfect storm of IH (Index Hopping). And given that the MiSeq/HiSeq2500 system wasn't as sensitive to these issues, it is believable that we've all picked up bad habits.
Old 07-27-2017, 01:17 AM   #107
nucacidhunter
Senior Member
 
Location: Iran

Join Date: Jan 2013
Posts: 1,035
Default

Index hopping is the result of annealed oligo extension by ExAmp. I do not know the details of ExAmp, but KAPA HiFi polymerase under stringent cycling conditions is able to extend primers as long as the 3’ base and 6 other bases in the 10-base region at the 3’ end are complementary, even though the rest of the oligo is not a match and just hangs off the template.

Leftover adapter oligos, PCR primers, single-adapted fragments and non-adapted fragments can act as primers and result in index hopping, neutral fusion fragments and cluster-forming fusion fragments, respectively. So the presence of a high concentration of oligos acting as primers and longer incubation of the library pool will increase these artifacts. I would also expect to see more fusion with PCR-free libraries, as the proportion of fragments without adapters at both ends is higher than in PCR-amplified libraries.
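
One way to read the rule of thumb stated in this post (the 3’ base plus 6 other bases in the last 10 must match) is sketched below in Python. This is an interpretation for illustration only, not a model of KAPA HiFi or ExAmp. The function name, the `window` and `min_other_matches` parameters, and the sequences are all hypothetical.

```python
def could_extend(primer, perfect_match, window=10, min_other_matches=6):
    """Apply the rule of thumb to the 3' end of a primer.

    Both arguments are 5'->3' strings. `perfect_match` is the oligo that
    would pair perfectly at the annealing site, so matching positions can
    be compared directly without computing complements.
    """
    tail = primer[-window:]
    target = perfect_match[-window:]
    if len(tail) < window or len(target) < window:
        return False
    if tail[-1] != target[-1]:
        return False  # the 3' terminal base must pair
    other_matches = sum(a == b for a, b in zip(tail[:-1], target[:-1]))
    return other_matches >= min_other_matches


site = "ACGTACGTAC"
print(could_extend("TTTTACGTAC", site))  # 3' base and 6 of the other 9 match
print(could_extend("ACGTACGTAG", site))  # 3' base mismatched
```

Only the last 10 bases are inspected, so a long non-matching 5’ overhang does not affect the result, which mirrors the "just hangs off the template" part of the description.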
Old 08-01-2017, 06:19 AM   #108
GW_OK
Senior Member
 
Location: Oklahoma

Join Date: Sep 2009
Posts: 383
Default

There've been a few hypotheses that ExAmp is actually Recombinase Polymerase Amplification (RPA), developed by TwistDx.

Here's a YouTube video describing it

It makes sense to me, and it is semi-described in one of Illumina's patents that James Hadfield reviewed on his blog.
Old 08-01-2017, 01:21 PM   #109
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,218
Default

Quote:
Originally Posted by Brian Bushnell View Post
We have several theories for what was driving this on HiSeq... the most plausible being something like, "library A had too many unincorporated adapters,
Yes, that is likely to be an issue.
Quote:
Originally Posted by Brian Bushnell View Post
library B had too many adapter-free inserts, and after mixing them, library B adopted some of the free adapters from library A".
No, adapter-free inserts will not be joined with unincorporated adapters without the intervention of a ligase.

Remember, DNA can be converted back and forth from single-stranded to double-stranded without the intervention of any enzyme if the right temperature/salt/concentration is present. The hydrogen-bond-guided interactions between the bases of reverse-complementary strands of DNA are reversible under these conditions.

The process of breaking the phosphodiester/ribose backbone requires much more energy. Joining DNA strands via their backbone pretty much requires an enzyme.
Quote:
Originally Posted by Brian Bushnell View Post
Which would indicate that it involves both the donor and recipient library. But I'm not sure if that mechanism is important for NovaSeq.
Probably the same. Seems like the only major difference is that you don't have to add the Ex-Amp glop to your denatured sample when using the NovaSeq. That happens in the instrument.

--
Phillip
Old 08-01-2017, 01:35 PM   #110
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,218
Default

Quote:
Originally Posted by nucacidhunter View Post
Index hopping is the result of annealed oligo extension by ExAmp. I do not know the details of ExAmp, but KAPA HiFi polymerase under stringent cycling conditions is able to extend primers as long as the 3’ base and 6 other bases in the 10-base region at the 3’ end are complementary, even though the rest of the oligo is not a match and just hangs off the template.

Leftover adapter oligos, PCR primers, single-adapted fragments and non-adapted fragments can act as primers and result in index hopping, neutral fusion fragments and cluster-forming fusion fragments, respectively. So the presence of a high concentration of oligos acting as primers and longer incubation of the library pool will increase these artifacts. I would also expect to see more fusion with PCR-free libraries, as the proportion of fragments without adapters at both ends is higher than in PCR-amplified libraries.
I hope not! That would also tend to create massive amounts of chimerism due to repetitive elements in genomic DNA, for instance. Hopefully whoever designed ExAmp would not allow low-stringency interactions of the sort you describe for the KAPA "HiFi" polymerase to result in this sort of (undesired) recombination.

I'm not really following why we need to posit either a low-stringency annealing event or actual ligations (as the mechanism described in Brian's post would require) to explain index hopping. If there is any amplification occurring anywhere but tethered to the surface of the flowcell, then unincorporated adapter oligos could anneal and be extended, creating a "cross-over event" that would generate an index-hopped library molecule. If that molecule seeded a cluster, then we would have an index hop.

--
Phillip
Old 08-02-2017, 01:54 AM   #111
nucacidhunter
Senior Member
 
Location: Iran

Join Date: Jan 2013
Posts: 1,035
Default

Illumina’s white paper on index hopping https://www.illumina.com/content/dam...inkId=36607862 shows that adding adapters not used in library prep increases index hopping, and more so as the amount of spiked-in adapter increases. These adapters will be dissociated into single-stranded oligos during denaturing. The oligos will be complementary to adapted library fragments over a maximum stretch of ~30 nt just before the adapter index sequences, which indicates that index hopping can occur when a relatively large overhang is present. I am not sure how many bases need to anneal for an extension event, but given the high processivity of ExAmp it might be a short stretch.

Chimerism will happen if the 3’ end of a fragment anneals to other fragments and is extended, so fragments with adapters at both ends, even with high similarity, will not cause fusion. Under my hypothesized mechanism, PCR-free libraries will then be more prone to index hopping and chimerism. Indeed, Illumina data https://www.illumina.com/science/edu...x-hopping.html indicate higher index hopping for PCR-free libraries, but they have not investigated chimerism events.

Index hopping can also happen on flow-cell-tethered fragments, but those would contribute only if they seed another well on the flow cell. Wells with chimeras and multiple indices will have low-quality sequences and will more likely be filtered out in read processing steps.
Old 08-03-2017, 03:22 AM   #112
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,218
Default

Quote:
Originally Posted by nucacidhunter View Post
Illumina’s white paper on index hopping https://www.illumina.com/content/dam...inkId=36607862 shows that adding adapters not used in library prep increases index hopping, and more so as the amount of spiked-in adapter increases.
Yes.
Quote:
Originally Posted by nucacidhunter View Post
These adapters will be dissociated to single stranded oligos during denaturing.
I agree.
Quote:
Originally Posted by nucacidhunter View Post
The oligos will be complementary to adapted library fragments in a maximum stretch of ~30 nt just before the adapter index sequences which indicates that index hopping can occur when relatively large overhang is present.
Yes.
Quote:
Originally Posted by nucacidhunter View Post
I am not sure how many bases need to anneal for an extension event, but given the high processivity of ExAmp it might be a short stretch.
"Processivity" isn't a measure of how short an annealed segment can be for a polymerase to extend from it. It's a measure of how long a polymerase will extend.
I don't doubt that many polymerases can extend from an oligo annealed over just a handful of bases. But an oligo annealed via a very short area of complementarity will do so with little stability unless the conditions of hybridization are such that they allow this interaction. For example, high salt concentrations can shield the negative phosphate backbone charges and thereby dampen the force that tends to tear the strands apart from one another.
Of course it is possible to lower the stringency of primer annealing in an amplification to allow just a few bases of homology to prime an extension event. But I can't think of any reason to do so during cluster formation -- it would allow various types of mis-priming events that would be very undesirable. So I would doubt that Illumina would use such conditions.
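
To illustrate the trend described in this post (short annealed stretches are unstable, and salt stabilizes them), here is a hedged Python sketch. It uses a commonly quoted empirical melting-temperature approximation, Tm = 81.5 + 16.6·log10[Na+] + 0.41·(%GC) − 675/N, which is not from the thread. That formula is intended for longer duplexes, so the absolute values for very short stretches should not be trusted, only the direction of the effect. The function name and sequences are illustrative.

```python
import math


def approx_tm_celsius(seq, na_molar):
    """Rough empirical Tm estimate from length, GC content and [Na+]."""
    n = len(seq)
    gc_percent = 100.0 * sum(base in "GC" for base in seq.upper()) / n
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 675.0 / n


short_stretch = "ACGTAC"       # a handful of annealed bases
long_stretch = "ACGTAC" * 5    # about 30 nt, same GC content

for na in (0.05, 1.0):
    print(na, round(approx_tm_celsius(short_stretch, na), 1),
          round(approx_tm_celsius(long_stretch, na), 1))
```

In this approximation the short stretch always melts far below the long one, and raising the salt concentration raises both estimates, which is consistent with the point that a few bases of homology will only prime extension under low-stringency conditions.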

--
Phillip
Old 08-03-2017, 03:51 AM   #113
nucacidhunter
Senior Member
 
Location: Iran

Join Date: Jan 2013
Posts: 1,035
Default

Maybe I am not using the correct terminology, but by processivity I meant the polymerisation speed. For instance, some brands will extend a primer at 1 kb/min while others can do 3 kb/min. Speedy polymerases, especially those with activity at suboptimal temperatures, tend to extend less complementary primers because the extension progresses before the weakly bound, unstable primers dissociate.
Tags
illumina, novaseq
