SEQanswers

11-16-2015, 12:09 PM   #1
skbrimer
Member
Location: OP Kansas
Join Date: Mar 2014
Posts: 53
Ion Torrent error correction

I asked this question on the Ion Community a couple of months ago and never got a reply, so I thought I would try here.

Over the last several years Ion Torrent has improved its chemistry and base-calling algorithm, and I'm wondering whether error correction is still advisable for Ion data.

I'm afraid that if Ion has already "corrected" the data in the single processing step, I would be introducing errors by correcting it a second time.
11-16-2015, 01:01 PM   #2
Brian Bushnell
Super Moderator
Location: Walnut Creek, CA
Join Date: Jan 2014
Posts: 2,707

There's no problem with error-correcting data multiple times. But if you error-correct it, be sure to use a program that can tolerate indel-type errors.
11-16-2015, 01:26 PM   #3
skbrimer
Member
Location: OP Kansas
Join Date: Mar 2014
Posts: 53

Thanks Brian

Are there times when you would not want to error-correct?
11-16-2015, 01:31 PM   #4
Brian Bushnell
Super Moderator
Location: Walnut Creek, CA
Join Date: Jan 2014
Posts: 2,707

You shouldn't error-correct if you are looking for rare variants (much less than the 50% ratio of a normal heterozygous diploid variant), or are doing amplicon sequencing, or are looking at tumor samples, or you have low coverage. Also, error-correction won't help much with platform-specific errors (like being unable to correctly determine the length of a long homopolymer), just with random errors.

If you have a reference, you can map before and after error-correction, and look at the error rates, to make sure error-correction improved things.
11-16-2015, 01:50 PM   #5
skbrimer
Member
Location: OP Kansas
Join Date: Mar 2014
Posts: 53

Sooo... how does one evaluate an error rate with a reference? Is it just a comparison of the VCF files?

Also, why would it be bad to error-correct in those situations? I imagine it has to do with "correcting" away an actual variant, but a variant would still have to be present at a rate higher than the machine's error rate to be called with any confidence, right? I.e., if you have a 1% error rate and 1000x coverage, you could not call anything at less than 10x, right?
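
The arithmetic behind that last question can be checked with a short sketch. It assumes every read covering a site is wrong independently with probability equal to the error rate, and that all errors mimic the same variant, which is a worst-case simplification (real Ion Torrent errors are neither uniform nor independent). The function names are made up for illustration.

```python
from math import comb


def expected_error_reads(coverage: int, error_rate: float) -> float:
    """Mean number of erroneous reads at one site under the simple model."""
    return coverage * error_rate


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance that errors alone give k or more reads."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


coverage = 1000    # reads covering the site
error_rate = 0.01  # assumed per-read error probability at the site

print("expected error reads:", expected_error_reads(coverage, error_rate))
for k in (10, 20, 30):
    # How often would errors alone produce at least k supporting reads?
    print(k, prob_at_least(k, coverage, error_rate))
```

Under this model a variant supported by about 10 reads looks just like noise, because errors alone reach that count roughly half the time, while counts well above the expected value become increasingly unlikely to be errors.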
11-16-2015, 01:56 PM   #6
Brian Bushnell
Super Moderator
Location: Walnut Creek, CA
Join Date: Jan 2014
Posts: 2,707

Map to the reference with BBMap, like this:

bbmap.sh ref=reference.fa in=reads.fq out=mapped.sam mhist=mhist.txt ehist=ehist.txt qhist=qhist.txt indelhist=indelhist.txt

BBMap will print useful statistics to the screen:
Code:
Read 1 data:            pct reads       num reads       pct bases          num bases

mapped:                  99.6100%            9961        99.6100%            1494150
unambiguous:             97.8900%            9789        97.8900%            1468350
ambiguous:                1.7200%             172         1.7200%              25800
low-Q discards:           0.0000%               0         0.0000%                  0

perfect best site:        1.7500%             175         1.7500%              26250
semiperfect site:         1.7500%             175         1.7500%              26250

Match Rate:                   NA               NA        61.1359%            1409105
Error Rate:              96.0596%            9605        38.5408%             888317
Sub Rate:                87.2787%            8727         2.2734%              52398
Del Rate:                43.4543%            4345        35.1743%             810722
Ins Rate:                48.9249%            4892         1.0932%              25197
N Rate:                  50.2050%            5020         0.3232%               7450
...and you can also plot the mhist or other histograms for more details.
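
For a before/after comparison, the rate lines of the summary above can be turned into numbers with a few lines of Python. This is a minimal sketch, assuming the summary has been saved as text and keeps the four-column layout shown (pct reads, num reads, pct bases, num bases). `parse_bbmap_rates` and `rate_changes` are hypothetical helper names, not part of BBMap.

```python
import re


def parse_bbmap_rates(stats_text: str) -> dict:
    """Return {label: percent of bases} for each '... Rate:' line of the summary."""
    rates = {}
    for line in stats_text.splitlines():
        m = re.match(r"\s*(\w+ Rate):\s+(.*)", line)
        if not m:
            continue
        fields = m.group(2).split()
        # Assumed column order: pct reads, num reads, pct bases, num bases
        if len(fields) == 4 and fields[2].endswith("%"):
            rates[m.group(1)] = float(fields[2].rstrip("%"))
    return rates


def rate_changes(before: dict, after: dict) -> dict:
    """Percentage-point change (after minus before) for labels present in both runs."""
    return {k: after[k] - before[k] for k in before if k in after}


# Two lines copied from the summary in this post
sample = """\
Match Rate:                   NA               NA        61.1359%            1409105
Error Rate:              96.0596%            9605        38.5408%             888317
"""
print(parse_bbmap_rates(sample))
```

Running the parser on the summaries from the raw and the error-corrected reads and passing both results to `rate_changes` gives one number per rate; a negative change in "Error Rate" would suggest correction helped on that data set.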

Quote:
Originally Posted by skbrimer
Also, why would it be bad to error-correct in those situations? I imagine it has to do with "correcting" away an actual variant, but a variant would still have to be present at a rate higher than the machine's error rate to be called with any confidence, right? I.e., if you have a 1% error rate and 1000x coverage, you could not call anything at less than 10x, right?
Error correction relies on high depth. With low depth it just doesn't work, and low depth of a variant compared to the reference will lead to that variant getting corrected away.
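
A toy illustration of a variant getting corrected away, as a sketch only: it looks at the bases from all reads at a single position and rewrites any base seen in less than a chosen fraction of reads to the majority base. Real error-correction programs do not work on pileup columns like this, and both `correct_column` and the `min_fraction` threshold are assumptions made up for the example.

```python
from collections import Counter


def correct_column(bases: str, min_fraction: float = 0.1) -> str:
    """Rewrite bases seen in fewer than min_fraction of the reads to the majority base."""
    counts = Counter(bases)
    majority = counts.most_common(1)[0][0]
    depth = len(bases)
    return "".join(
        b if counts[b] / depth >= min_fraction else majority for b in bases
    )


het = "A" * 50 + "G" * 50   # heterozygous diploid site, about 50% each allele
rare = "A" * 97 + "G" * 3   # rare variant at 3% of reads

print(Counter(correct_column(het)))   # both alleles are kept
print(Counter(correct_column(rare)))  # the G reads are rewritten to A
```

The 50% heterozygous site passes through unchanged, while the 3% variant is indistinguishable from noise under this rule and disappears.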

Last edited by Brian Bushnell; 11-16-2015 at 01:58 PM.
All times are GMT -8.