SEQanswers


geejaytee 02-05-2019 05:55 AM

Correcting for deamination in variant calling
 
Hiya,

I'm calling variants with Torrent Suite on DNA sequenced from formalin-fixed, paraffin-embedded (FFPE) tissue. The major issue is that some of the Cs in the original DNA are deaminated to uracil, and without uracil-N-glycosylase (UNG) treatment these are read as T. After sequencing and variant calling they can show up as mutations, either as C>T transitions or as G>A (from the opposite strand, due to PCR in the library prep). I have no idea how long these samples were stored without UNG before sequencing.

TVC gives a deamination metric (essentially, the number of C>T and G>A variants divided by the number of all variants called); for our samples, the highest value seen is ~0.92. Naive postprocessing of the variants shows that C>T/G>A transitions overwhelm the remaining variants in these samples.
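As a rough illustration of the metric described above, here is a minimal Python sketch that recomputes a C>T plus G>A fraction from the lines of a VCF. It is not TVC's own implementation: the function name `deamination_fraction` is made up for this example, and counting each ALT allele once is an assumption about how "all variants called" is tallied.

```python
def deamination_fraction(vcf_lines):
    """Return (C>T + G>A calls) / (all called alleles) for an iterable of VCF lines.

    Assumption: every ALT allele on a record counts as one called variant.
    """
    deamination_types = {("C", "T"), ("G", "A")}
    total = 0
    hits = 0
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        ref = fields[3].upper()
        for alt in fields[4].upper().split(","):
            if alt in (".", "*"):
                continue  # no alternate allele called here
            total += 1
            if (ref, alt) in deamination_types:
                hits += 1
    return hits / total if total else 0.0
```

Called on the lines of a TVC output VCF (for example via `open(path)`), a value close to 1 would mean nearly every call is a C>T or G>A change.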

My question is this: given the Ion Torrent variant calling pipeline (sequencing > BAM file > TVC > VCF file with deamination statistic), is there:

a) a way of correcting the output VCF, or
b) a set of filters to use in bcftools,

to reduce this effect on the samples?

My use case is this: these are medical samples which have been inspected by a pathologist (hence the FFPE treatment), and I want to determine which variants are predictive* of outcome. I therefore have two potentially contradictory goals: reduce false positives, and capture the rarer variants which may hold predictive power.

Thanks!

*It's a retrospective trial, so 'predictive' in the sense of which variants correspond with outcomes

geejaytee 02-20-2019 06:28 AM

They do not need correcting. As the deamination (C>T and G>A) damage is random, any given damaged position is seen in only a small share of reads, so the predicted allele frequencies for these (false) calls will be small (<0.25, say) and the calls will be of low quality. A simple filter on quality and allele fraction can then be used to remove them, as they cluster separately from the true variants.
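A minimal sketch of such a filter in Python, for anyone who wants something concrete to start from. Several things here are assumptions rather than anything from the pipeline itself: the allele fraction is read from an `AF` key in the INFO column, the QUAL cutoff of 30 is an arbitrary placeholder (only the ~0.25 allele fraction comes from the post above), and the function names are made up for this example. It drops a C>T or G>A call only when it has both a low allele fraction and a low quality; anything it cannot judge is kept.

```python
DEAMINATION_TYPES = {("C", "T"), ("G", "A")}

def parse_info(info):
    """Turn a VCF INFO string such as 'AF=0.1;DP=500' into a dict."""
    parsed = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        parsed[key] = value
    return parsed

def looks_like_deamination_artifact(line, max_af=0.25, min_qual=30.0):
    """True for a C>T or G>A record with both low AF and low QUAL.

    Assumptions: AF is in the INFO column; min_qual is a placeholder value.
    Multi-allelic records never match and are therefore kept.
    """
    fields = line.rstrip("\n").split("\t")
    ref, alt, qual = fields[3].upper(), fields[4].upper(), fields[5]
    if (ref, alt) not in DEAMINATION_TYPES:
        return False
    try:
        af = float(parse_info(fields[7])["AF"])
        quality = float(qual)
    except (KeyError, ValueError):
        return False  # AF or QUAL missing or unparsable: leave the call alone
    return af < max_af and quality < min_qual

def filter_vcf(lines, **thresholds):
    """Keep header lines and every record not flagged as a likely artifact."""
    return [
        line for line in lines
        if line.startswith("#")
        or not looks_like_deamination_artifact(line, **thresholds)
    ]
```

Requiring both conditions errs on the side of keeping rare variants; swapping `and` for `or` in the last line of `looks_like_deamination_artifact` gives a stricter filter. The thresholds are best chosen by plotting quality against allele fraction for your own samples and seeing where the two clusters separate.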

