Let's say I have processed raw reads from a tumor-normal paired exome experiment and made them fit for mutation calling. I feed the two BAM files into a mutation caller and, since it's an exome experiment, I restrict the variant calls to exons + 10 bases by supplying a .bed file of refGene exons generated from the UCSC Table Browser.
Now, theoretically, all the mutation calls made by the caller should be exonic or splicing.
But when I run these calls through annotation software and annotate them against a refGene set (I tried both snpEff and ANNOVAR; with ANNOVAR I used the default hg19 set), only approximately 65%-80% of the calls are exonic or splicing. The rest are annotated as intronic, upstream, downstream and a number of other categories.
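To rule out the simplest cause, here is a rough sketch of how one could check that every call really does fall inside the padded .bed intervals. It assumes a standard 3-column BED (0-based, half-open) and 1-based VCF positions; the function names `load_bed` and `in_targets` and the example coordinates are made up for illustration and do not come from any of the tools mentioned above.

```python
import bisect
from collections import defaultdict


def load_bed(lines, pad=0):
    """Parse BED lines into merged, sorted intervals per chromosome.

    BED coordinates are 0-based and half-open. `pad` extends each
    interval on both sides (use 0 if the file is already padded).
    """
    raw = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        raw[chrom].append((max(0, int(start) - pad), int(end) + pad))

    merged = {}
    for chrom, ivs in raw.items():
        ivs.sort()
        out = [list(ivs[0])]
        for s, e in ivs[1:]:
            if s <= out[-1][1]:  # overlapping or adjacent: extend
                out[-1][1] = max(out[-1][1], e)
            else:
                out.append([s, e])
        # Store parallel lists of starts and ends for binary search.
        merged[chrom] = ([s for s, _ in out], [e for _, e in out])
    return merged


def in_targets(merged, chrom, pos):
    """Return True if a 1-based position (as in a VCF) is inside a target."""
    if chrom not in merged:
        return False
    starts, ends = merged[chrom]
    p0 = pos - 1  # convert to the 0-based BED coordinate system
    i = bisect.bisect_right(starts, p0) - 1
    return i >= 0 and p0 < ends[i]


if __name__ == "__main__":
    # Hypothetical exon intervals and variant positions.
    bed = ["chr1\t100\t200", "chr1\t190\t250", "chr2\t10\t20"]
    targets = load_bed(bed, pad=10)
    for chrom, pos in [("chr1", 91), ("chr1", 300), ("chr3", 5)]:
        print(chrom, pos, in_targets(targets, chrom, pos))
```

If every call passes this check, then the calls are inside the regions I supplied, and the difference would have to come from how the annotation tools classify those positions rather than from the caller ignoring the .bed file.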
I have been trying to think of an explanation as to why, but I just can't.
Has anybody here noticed this before? Is there an explanation as to why this is happening?
Thank you.
Shyam.
PS: It's not a problem with the mutation caller either; I have tried two of them.