SEQanswers

Old 01-31-2012, 09:51 PM   #1
ECO
--Site Admin--
 
Location: SF Bay Area, CA, USA

Join Date: Oct 2007
Posts: 1,358
Default Ion Torrent claims of MiSeq showing post-homopolymer substitution errors

I'd like to start some discussion here of perhaps the most interesting thing going on in the sequencing marketing world this week (while we wait for Roche to up its bid for ILMN or bail).

Ion Torrent posted an analysis of public MiSeq data on the Ion Community that describes a "clear systematic bias within MiSeq® data". A choice quote is below (a PDF export of the post is attached...you know, for openness):

Quote:
"These substitution errors often fall to the last base of a homopolymer region - based on the direction of the read. For example, in a stretch of three G bases, the fourth base is often erroneously called a G. This strand-specific pattern is wide spread, and explains 49.9% and 51.8% of MiSeq® substitution errors overall in DH10B and K12, respectively. This dominant error profile that can be found so frequently next to homopolymer regions suggests a clear systematic bias within MiSeq® data."
Keith Robison and Monkol Lek have taken a look at the claims on their respective blogs.
Attached Files
File Type: pdf 2279.pdf (148.4 KB, 217 views)
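
The pattern described in the quote can be checked per mismatch. What follows is a minimal sketch, not Ion Torrent's actual analysis: the function names, the 0-based forward-strand coordinates, and the `min_run=3` threshold (taken from the "three G bases" example) are all assumptions made for illustration.

```python
def preceding_run_length(ref, pos, reverse=False):
    """Length of the homopolymer run just before `pos` in read direction.

    `ref` is the forward-strand reference, `pos` a 0-based index.
    For a reverse-strand read, "before" means to the right of `pos`.
    """
    step = 1 if reverse else -1
    i = pos + step
    if not 0 <= i < len(ref):
        return 0
    base = ref[i]
    run = 0
    while 0 <= i < len(ref) and ref[i] == base:
        run += 1
        i += step
    return run


def is_post_homopolymer_error(ref, pos, called, reverse=False, min_run=3):
    """True if the miscalled base repeats a preceding run of >= min_run bases.

    `called` is given in forward-strand coordinates, as in an alignment file.
    """
    step = 1 if reverse else -1
    i = pos + step
    if not 0 <= i < len(ref):
        return False
    return (
        called != ref[pos]
        and called == ref[i]
        and preceding_run_length(ref, pos, reverse) >= min_run
    )


ref = "ATGGGTCA"
# Forward read: the T after GGG is miscalled as G.
print(is_post_homopolymer_error(ref, 5, "G"))
# Reverse read: the run lies to the right, so the T at index 1 is affected.
print(is_post_homopolymer_error(ref, 1, "G", reverse=True))
```

Tallying this flag over all mismatches in a pileup, split by strand, would give the kind of percentage quoted above.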
Old 01-31-2012, 10:54 PM   #2
koadman
Member
 
Location: Sydney, Australia

Join Date: May 2010
Posts: 65
Default

I love it. The good folks at Life Tech may have in fact helped to make analysis pipelines for MiSeq better by publicizing a bias that is probably much more fixable than the homopolymer issues on their own platform. Keep up the good work Ion.
Old 02-01-2012, 02:36 AM   #3
TonyBrooks
Senior Member
 
Location: London

Join Date: Jun 2009
Posts: 298
Default

Surely, as this is strand specific, it's not too big a problem. You just need to be sure that any SNP is visible in both forward and reverse reads. If it's only seen in reads from one direction, then you should ignore it, treat it with caution, or at least give it a really low mappability score - something I think most aligners do (correct me if I'm wrong).

The only problem is if you had a single base flanked by homopolymers in both directions. Then the base would be miscalled on both strands.
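
The both-strands check above can be sketched as a simple filter. This is only an illustration: the per-strand counts are made-up numbers, and the `min_per_strand` threshold and function name are assumptions, not settings from any particular aligner or variant caller.

```python
def passes_strand_filter(fwd_alt, rev_alt, min_per_strand=2):
    """Keep a candidate SNP only if the alternate allele appears on both strands."""
    return fwd_alt >= min_per_strand and rev_alt >= min_per_strand


# Hypothetical candidates: alternate-allele read counts per strand.
candidates = [
    {"pos": 1042, "fwd_alt": 14, "rev_alt": 0},   # one strand only: suspicious
    {"pos": 2310, "fwd_alt": 9, "rev_alt": 11},   # supported on both strands
    {"pos": 5177, "fwd_alt": 1, "rev_alt": 12},   # weak forward support
]

kept = [
    c["pos"]
    for c in candidates
    if passes_strand_filter(c["fwd_alt"], c["rev_alt"])
]
print(kept)
```

As noted above, a base flanked by homopolymers on both sides could still pass a filter like this, since the miscall would then appear on both strands.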
Old 02-01-2012, 02:59 AM   #4
ulz_peter
Senior Member
 
Location: Graz, Austria

Join Date: Feb 2010
Posts: 219
Default

Is someone else also getting tired of companies trying to prove the weaknesses of the opponent rather than focussing on their own system?
Old 02-01-2012, 09:02 AM   #5
sinaian
Junior Member
 
Location: Boston

Join Date: Jan 2011
Posts: 4
Default

So ONLY NOW someone finally realizes weakness of opponent is not a proper subject? How convenient is the timing ...

BTW, trading sensitivity for specificity is always a great solution.
Old 02-01-2012, 11:46 AM   #6
SeqAA
Guest
 

Posts: n/a
Default

I wonder if this is related to the fast chemistry times of Illumina's newest platforms? Seems odd such a prevalent error profile would go unnoticed.
Old 02-01-2012, 06:14 PM   #7
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693
Default

Let me discard the previous post.

IonTorrent is finding something real. However, I think this is not caused by homopolymer runs, at least not mainly, but by the "GGC" and/or the inverted repeat artifact [PMID:21576222]. This region is particularly enriched with GGC on both forward and backward strands. In addition, the screenshot is exaggerating the Illumina problem a little bit: they disabled shading in IGV; the majority of mismatches have quality below 10 and are barely visible under the IGV default setting. Some mismatches do get Q20 recurrently, which is worrying.

Last edited by lh3; 02-01-2012 at 06:54 PM.
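
One rough way to check the GGC enrichment claim for a region is to count the motif on both strands. This is a hedged sketch with hypothetical helper names and a made-up example sequence; it counts overlapping occurrences and says nothing about what level of enrichment is significant.

```python
def count_overlapping(seq, motif):
    """Count occurrences of `motif` in `seq`, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def ggc_counts(region):
    """Return (forward-strand GGC count, reverse-strand GGC count)."""
    region = region.upper()
    return (
        count_overlapping(region, "GGC"),
        count_overlapping(revcomp(region), "GGC"),
    )


# Made-up region for illustration only.
forward, backward = ggc_counts("AGGCTTGCCAGGCGCC")
print(forward, backward)
```

Comparing these counts against the same statistic over the rest of the genome would indicate whether a given error hotspot is unusually GGC-rich.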
Old 04-29-2012, 09:17 PM   #8
snetmcom
Senior Member
 
Location: USA

Join Date: Oct 2008
Posts: 158
Default

Quote:
Originally Posted by sinaian View Post
So ONLY NOW someone finally realizes weakness of opponent is not a proper subject? How convenient is the timing ...

BTW, trading sensitivity for specificity is always a great solution.
just poking through their documentation, there are several publications that have found this before.
Old 04-30-2012, 03:58 AM   #9
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Quote:
Originally Posted by snetmcom View Post
just poking through their documentation, there are several publications that have found this before.
Yes, I think MIRA creator, Bastien Chevreux, noticed it first -- and changed MIRA to compensate for the Illumina GGCxG issue. Bizarre Illumina has not fixed it themselves, but there are a handful of issues Illumina seems blind to.

--
Phillip
Old 05-20-2012, 09:16 PM   #10
alanwan
Junior Member
 
Location: BEIJING

Join Date: Sep 2008
Posts: 4
Default

The systematic bias does exist, but it is usually very small: no more than 1 in 1,000 detected SNVs is caused by systematic errors, so few people notice it.

However, it is fatal for detecting novel SNPs that cause rare diseases. Systematic errors occur randomly across the whole genome, and known SNPs occupy only about 1/100 of the genome's base positions (dbSNP 135: ~30M of ~3G), so most of the erroneous SNVs land on novel sites. That leads to a high false positive rate among your novel SNPs.
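
The arithmetic behind this argument can be made explicit. The sketch below is illustrative only: the 30M/3G figure comes from the post, but `true_known_fraction=0.95` (the share of real variants already catalogued) and the second error fraction of 1% are assumptions added here, not numbers from this thread.

```python
def novel_false_fraction(error_fraction, known_site_fraction, true_known_fraction):
    """Fraction of novel SNV calls that are errors, per unit of total calls."""
    false_calls = error_fraction
    true_calls = 1.0 - error_fraction
    # Errors hit known-SNP positions only in proportion to their share of the genome.
    novel_false = false_calls * (1.0 - known_site_fraction)
    # Most real variants are already catalogued; only the remainder are novel.
    novel_true = true_calls * (1.0 - true_known_fraction)
    return novel_false / (novel_false + novel_true)


known_site_fraction = 30e6 / 3e9  # about 0.01, as stated in the post

for error_fraction in (0.001, 0.01):
    share = novel_false_fraction(
        error_fraction, known_site_fraction, true_known_fraction=0.95
    )
    print(f"errors = {error_fraction:.1%} of all calls -> {share:.1%} of novel calls")
```

Under these assumptions the errors are concentrated roughly twentyfold among novel calls, which is the enrichment the post is pointing at; the absolute rate depends heavily on the inputs.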

This problem can be far worse if you want to find novel SNPs common to all samples in a population of size >= 3. In fact, in 2010 we found a terrible FPR (>98%) among the common novel SNVs detected in a whole-exome sequencing project (family samples, size = 3, sequence generated on one GAII). However, it is important to note that not all of our Illumina sequence data have such a high error rate.

In my observation, the preceding homopolymer causes most of the false positives, while the GGC problem is minor. I think it may depend on sample properties and other factors.

As you may already have found, many articles describe methods for handling the systematic biases of NGS instruments, such as GATK variant recalibration, VarScan, CRISP, SERVIC4E, etc. Unfortunately, there is no consensus on which method provides the best solution. No offense, but I personally had a bad experience with old versions of GATK, which crashed again and again and were too picky about BAM files exported by other aligners. I have not tried the other tools yet, and I am still using my own scripts to filter the false positives.
Old 05-21-2012, 01:56 AM   #11
james hadfield
Moderator
Cambridge, UK
Community Forum
 
Location: Cambridge, UK

Join Date: Feb 2008
Posts: 221
Default

Is
Quote:
Originally Posted by sinaian View Post
weakness of opponent
a popular strategy in the US at the moment due to your 57th presidential election?

Bashing opponents always makes juicier headlines than demonstrating minor improvements to your own system. I would very much prefer to hear Ion discussing the very real improvements they have made. The technology has raced forward as fast as we hoped it would.
Old 05-21-2012, 11:51 AM   #12
sinaian
Junior Member
 
Location: Boston

Join Date: Jan 2011
Posts: 4
Default

Quote:
Originally Posted by james hadfield View Post
Bashing opponents always makes juicier headlines than demonstrating minor improvements to your own system. I would very much prefer to hear Ion discussing the very real improvements they have made. The technology has raced forward as fast as we hoped it would.
Fully agreed. But it is just interesting to compare the atmosphere when one party comes out bashing the other versus when the opponent actually answers back.
Old 05-22-2012, 04:03 AM   #13
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,317
Default

Can anyone verify that this is the old "GGCxG" issue?

If so, I have my doubts that Illumina will address the issue on the basis of LifeTech pointing it out. Seems either to be firmly in their corporate blind spot or an intractable issue.

--
Phillip
Old 05-22-2012, 04:31 AM   #14
alanwan
Junior Member
 
Location: BEIJING

Join Date: Sep 2008
Posts: 4
Default

Quote:
Originally Posted by pmiguel View Post
Can anyone verify that this is the old "GGCxG" issue?

If so, I have my doubts that Illumina will address the issue on the basis of LifeTech pointing it out. Seems either to be firmly in their corporate blind spot or an intractable issue.

--
Phillip
This system bias problem probably can never be completely solved. But I believe new algorithms will help distinguish the error calls.
Tags
ion torrent, marketing, miseq, pgm

All times are GMT -8. The time now is 07:52 AM.